Abstract



We begin by reviewing our current understanding of massless particles with spin 1 and spin 
2 as mediators of long-range forces in relativistic quantum field theory. We discuss how a 
description of such particles that is compatible with Lorentz covariance naturally leads to 
a redundancy in the mathematical description of the physics, which in the spin-1 case is 
local gauge invariance and in the spin-2 case is the diffeomorphism invariance of General
Relativity. We then discuss the Weinberg-Witten theorem, which further underlines the
need for local invariance in relativistic theories with massless interacting particles that have 
spin greater than 1/2. 

This discussion leads us to consider a possible class of models in which long-range
interactions are mediated by the Goldstone bosons of spontaneous Lorentz violation. Since
the Lorentz symmetry is realized non-linearly in the Goldstones, these models evade the
Weinberg-Witten theorem and could potentially also evade the need for local gauge
invariance in our description of fundamental physics. In the case of gravity, the broken symmetry
would protect the theory from having a non-zero cosmological constant, while the
compositeness of the graviton could provide a solution to the perturbative non-renormalizability of
linear gravity.

This leads us to consider the phenomenology of spontaneous Lorentz violation and 
the experimental limits thereon. We find the general low-energy effective action of the 
Goldstones of this kind of symmetry breaking minimally coupled to the usual Einstein 
gravity and we consider observational limits resulting from modifications to Newton's law 
and from gravitational Cerenkov radiation of the highest-energy cosmic rays. We compare 
this effective theory with the "ghost condensate" mechanism, which has been proposed in 
the literature as a model for gravity in a Higgs phase. 

Next, we summarize the cosmological constant problem and consider some issues related 
to it. We show that models in which a scalar field causes the super-acceleration of the
universe generally exhibit instabilities that can be more broadly connected to the violation
of the null-energy condition. We also discuss how the equation of state parameter $w = p/\rho$
evolves in a universe where the dark energy is caused by a ghost condensate. Furthermore,
we comment on the anthropic argument for a small cosmological constant and how it is 
weakened by considering the possibility that the size of the primordial density perturbations 
created by inflation also varies over the landscape of possible universes. 

Finally, we discuss a problem in elementary fluid mechanics that had eluded a definitive 
treatment for several decades: the reverse sprinkler, commonly associated with Feynman. 
We provide an elementary theoretical description compatible with its observed behavior. 
Chapter 1

Introduction 

est aliquid, quocumque loco, quocumque recessu 
unius sese dominum fecisse lacertae. 

— Juvenal, Satire III 

He thought he saw an Argument
That proved he was the Pope: 
He looked again, and found it was 
A Bar of Mottled Soap. 
"A fact so dread," he faintly said, 
"Extinguishes all hope!" 

— Lewis Carroll, Sylvie and Bruno Concluded 

This dissertation is essentially a collection of the various theoretical investigations that 
I pursued as a graduate student and that progressed to a publishable state. It is difficult, 
a posteriori, to come up with a theme that will unify them all. Even the absurdly broad 
title that I have given to this document fails to account at all for Chapter 6, which concerns
a long-standing problem in elementary fluid mechanics. Therefore I will not attempt any 
such artificial unification here. 

I have made an effort, however, to make this thesis more than a collation of previously
published papers. To that end, I have added material that reviews and clarifies the relevant 
physics for the reader. Also, as far as possible, I have complemented the previously published 
research with discussions of recent advances in the literature and in my own understanding. 

Chapter 2 in particular was written from scratch and is intended as a review of the
relationship between massless particles, Lorentz invariance (LI), and local gauge invariance. 
In writing it I attempted to answer the charge half-seriously given to me as a first-year 
graduate student by Mark Wise of figuring out why we religiously follow the commandment 
of promoting the global gauge invariance of the Dirac Lagrangian to a local invariance in
order to obtain an interacting theory. Consideration of the role of local gauge invariance in
quantum field theories (QFTs) with massless, interacting particles also helps to motivate
the research described in Chapter 3.

Chapter 3 brings up spontaneous Lorentz violation, which is the idea that perhaps the
quantum vacuum of the universe is not a Lorentz singlet (or, to put it otherwise, that empty 
space is not empty). The idea that gravity might be mediated by the Goldstone bosons 
of such a symmetry breaking is attractive because it offers a possible solution to two of 
the greatest obstructions to a quantum description of gravity: the non-renormalizability of 
linear gravity, and the cosmological constant problem. 

The work described in Chapter 4 seeks to place experimental limits on how large
spontaneous Lorentz violation can be when coupled to ordinary gravity. This line of research is
independent of the ideas of Chapter 3 and applies to a wide variety of models in which
cosmological physics takes place in a background that is not a Lorentz singlet.

Chapter 5 begins with a brief overview of the cosmological constant problem, one of
the greatest puzzles in modern theoretical physics. The next three sections of that chapter
concern original results that are connected to that problem. Section 5.2 in particular has
applications beyond the cosmological constant problem, as it offers a theorem that helps 
connect the energy conditions of General Relativity (GR) with considerations of stability. 

All of this work concerns both QFT and GR, our two most powerful (though mutually 
incompatible) tools for describing the universe at a fundamental level. In Chapter 6 we
consider an amusing problem about introductory college physics that, surprisingly, had 
evaded a completely satisfactory treatment for several decades. 

1.1 Notation and conventions 

We work throughout in units in which $\hbar = c = 1$. Electrodynamical quantities are given in
the Heaviside-Lorentz system of units in which the Coulomb potential of a point charge $q$
is

$$ \Phi = \frac{q}{4\pi r} . $$

We also work in the convention in which the Fourier transform and inverse Fourier
transform in $n$ dimensions are
Lorentz 4-vectors are written as $x = (x^0, x^1, x^2, x^3)$, where $x^0$ is the time component
and $x^1$, $x^2$, and $x^3$ are the $x$, $y$, and $z$ space components respectively. Spatial vectors are
denoted by boldface, so that we also write $x = (x^0, \mathbf{x})$. Unit spatial vectors are denoted by
superscript hats. Greek indices such as $\mu$, $\nu$, $\rho$, etc. are understood to run from 0 to 3, while
Roman indices such as $i$, $j$, $k$, etc. are understood to run from 1 to 3. Repeated indices are
always summed over, unless otherwise specified.

We take $g_{\mu\nu}$ to represent the full metric in GR, while $\eta_{\mu\nu} = \mathrm{diag}(+1, -1, -1, -1)$ is the
Minkowski metric of flat space-time. Indices are raised and lowered with the appropriate
metric. The square of a tensor denotes the product of the tensor with itself, with all the
indices contracted pairwise with the metric. Thus, for instance, the d'Alembertian operator
in flat spacetime is

$$ \Box \equiv \partial^2 = \eta^{\mu\nu} \partial_\mu \partial_\nu . $$
We define the Planck mass as $M_{\rm Pl} = \sqrt{1/8\pi G}$, where $G$ is Newton's constant. For linear
gravity we expand the metric in the form $g_{\mu\nu} = \eta_{\mu\nu} + M_{\rm Pl}^{-1} h_{\mu\nu}$ and keep only terms linear
in $h$. In Chapter 121 we will work in units in which $M_{\rm Pl} = 1$. Elsewhere we will show the
factors of $M_{\rm Pl}$ explicitly.

We use the chiral basis for the Dirac matrices

$$ \gamma^\mu = \begin{pmatrix} 0 & \sigma^\mu \\ \bar\sigma^\mu & 0 \end{pmatrix} , $$

where $\sigma^\mu = (1, \boldsymbol{\sigma})$, $\bar\sigma^\mu = (1, -\boldsymbol{\sigma})$, and the $\sigma^i$'s are the Pauli matrices

$$ \sigma^1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} ; \qquad \sigma^2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} ; \qquad \sigma^3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} . $$

All other conventions are the standard ones in the literature.

In writing this thesis, I have used the first person plural ("we") whenever discussing
scientific arguments, regardless of their authorship. I have used the first person singular
only when referring concretely to myself in introductory or parenthetical material. I feel
that this inconsistency is justified by the avoidance of stylistic absurdities.
Chapter 2

Massless mediators 

Did he suspire, that light and weightless down 
perforce must move. 

— William Shakespeare, Henry IV, part ii, Act 4, Scene 3 

You lay down metaphysic propositions which infer universal consequences, 
and then you attempt to limit logic by despotism. 

— Edmund Burke, Reflections on the Revolution in France 

2.1 Introduction 

I have sometimes been asked by scientifically literate laymen (my father, for instance, who is 
a civil engineer, and my ophthalmologist) to explain to them how a particle like the photon 
can be said to have no mass. How would a particle with zero mass be distinguishable from 
no particle at all? My answer to that question has been that in modern physics a particle is 
not defined as a small lump of stuff (which is the mental image immediately conveyed by the 
word, as well as the non-technical version of the classical definition of the term) but rather 
as an excitation of a field, somewhat akin to a wave in an ocean. In that sense, masslessness 
means something technical: that the excitation's energy goes to zero when its wavelength 
is very long. I have then added that masslessness also means that those excitations must 
always propagate at the speed of light and can never appear to any observer to be at rest. 

Here I will attempt a fuller treatment of this problem. Much of the professional life of 
a theoretical physicist consists of ignoring technical difficulties and underlying conceptual 
confusion, in the hope that something publishable and perhaps even useful might emerge 
from his labor. If the theorist had to proceed in strictly logical order, the field would advance
very slowly. But, on the other hand, the only thing that can ultimately protect us from
being seriously wrong is sufficient clarity about the basics. In modern physics, long-range
forces (electromagnetism and gravity) are understood to be mediated by massless particles
with spin $j \geq 1$. The description of such massless particles in quantum field theory (QFT)
is therefore absolutely central to our current understanding of nature.

Therefore, I have decided to use the opportunity afforded by the writing of this thesis to 
review the subject. My goals are to elucidate why a relativistic description of massless
particles with spin $j \geq 1$ naturally requires something like local gauge invariance (which is not a
physical symmetry at all, but a mathematical redundancy in the description of the physics) 
and to clarify under what circumstances one might expect to evade this requirement. 

I shall conclude with a discussion of how these considerations apply to whether some 
of the major outstanding problems of quantum gravity could be addressed by considering 
gravity to be an emergent phenomenon in some theory without fundamental gravitons. 
Nothing in this chapter will be original in the least, but it will provide a motivation for 
some of the original work presented in Chapter 3.

2.1.1 Unbearable lightness 

In his undergraduate textbook on particle physics, David Griffiths points out that massless
particles are meaningless in Newtonian mechanics because they carry no energy or
momentum, and cannot sustain any force. On the other hand, the relativistic expression for energy
and momentum:

$$ p^\mu = (E, \mathbf{p}) = \gamma m (1, \mathbf{v}) \tag{2.1} $$

allows for non-zero energy-momentum for a massless particle if $\gamma = (1 - \mathbf{v}^2)^{-1/2} \to \infty$,
which requires $|\mathbf{v}| \to 1$. Equation (2.1) doesn't tell us what the energy-momentum is, but
we assume that the relation $p^2 = m^2$ is valid for $m = 0$, so that a massless particle's energy
$E$ and momentum $\mathbf{p}$ are related by

$$ E = |\mathbf{p}| . \tag{2.2} $$

Griffiths adds that

Personally I would regard this "argument" as a joke, were it not for the fact
that [massless particles] are known to exist in nature. They do indeed travel at
the speed of light and their energy and momentum are related by [Eq. (2.2)].

Figure 2.1: Feynman diagram for the scattering of two particles that interact through the exchange
of a mediator.

The problem of what actually determines the energy of the massless particle is solved not 
by special relativity, but by quantum mechanics, via Planck's formula $E = \omega$, where $\omega$ is an
angular frequency (which is an essentially wave-like property). Thus massless particles are
the creatures of QFT par excellence, because, at least in current understanding, they can 
only be defined as relativistic, quantum-mechanical entities. Like other subjects in QFT, 
describing massless particles requires arguments that would seem absurd were it not for the 
fact that they yield surprisingly useful results that have given us a handle on observable 
natural phenomena. 

We need massless particles because we regard interaction forces as resulting from the
exchange of other particles, called "mediators." Figure 2.1 shows the Feynman diagram that
represents the leading perturbative term in the amplitude for the scattering of two particles
(represented by the solid lines) that interact via the exchange of a mediator (represented
by the dashed line). We can calculate this Feynman diagram in QFT and match the result
to what we would get in non-relativistic quantum mechanics from an interaction potential
$V(r)$ (see, e.g., Section 4.7 in [2]). The result is

$$ V(r) = -\frac{g^2}{4\pi} \frac{e^{-\mu r}}{r} , \tag{2.3} $$

where $g$ is the coupling constant that measures the strength of the interaction and $\mu$ is the
mediator's mass. Therefore, a long-range force requires $\mu = 0$. In order to accommodate
the observed properties of the long-range electromagnetic and gravitational interactions, we
also need to give the mediator a non-zero spin. We will see that this is non-trivial.
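As a minimal numerical sketch of why a long-range force requires a massless mediator, the snippet below assumes the standard Yukawa form $V(r) = -(g^2/4\pi)\, e^{-\mu r}/r$ for the potential in Eq. (2.3). The function names and sample values are illustrative assumptions, not part of the original treatment. The ratio to the $\mu = 0$ potential is $e^{-\mu r}$, so any non-zero mediator mass cuts the force off at distances $r \gtrsim 1/\mu$.

```python
import math


def yukawa_potential(r, g, mu):
    """Assumed Yukawa form in natural units: -(g^2 / 4 pi) * exp(-mu * r) / r."""
    return -(g ** 2) / (4.0 * math.pi) * math.exp(-mu * r) / r


def suppression(r, mu, g=1.0):
    """Ratio of the massive-mediator potential to the massless (Coulomb-like) one."""
    return yukawa_potential(r, g, mu) / yukawa_potential(r, g, 0.0)


if __name__ == "__main__":
    # The suppression factor exp(-mu * r) falls off quickly once r exceeds 1/mu.
    for r in (0.1, 1.0, 10.0):
        print(r, suppression(r, mu=1.0))
```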
2.1.2 Overview

In this chapter we shall first briefly review how one-particle states are defined in QFT 
and how their polarizations correspond to basis states in irreducible representations of the 
Lorentz group. We will emphasize the difference between the case when the mass m of the 
particle is positive and the case when it is zero. We shall proceed to use these tools to build a
field $A^\mu$ that transforms as a Lorentz 4-vector, first for $m > 0$ and then for $m = 0$. We shall
conclude that the relativistic description of a massless spin-1 field requires the introduction 
of local gauge invariance. Similarly, we will point out how the relativistic description of a 
massless spin-2 particle that transforms like a two-index Lorentz tensor requires something 
like diffeomorphism invariance (the fundamental symmetry of GR). Our discussion of these 
matters will rely heavily on the treatment given in 

We will then seek to formulate a solid understanding of the meaning of local gauge 
invariance and diffeomorphism invariance as redundancies of the mathematical description 
required to formulate a relativistic QFT with massless mediators. To this end we will also 
review the Weinberg-Witten theorem (^) and conclude by considering how it might be
possible to do without gauge invariance and evade the Weinberg-Witten theorem in an
attempt to write a QFT of gravity without UV divergences. 

2.2 Polarizations and the Lorentz group 

We define one-particle states to be eigenstates of the 4-momentum operator $P^\mu$ and label
them by their eigenvalues, plus any other degrees of freedom that may characterize them:

$$ P^\mu |p, r\rangle = p^\mu |p, r\rangle . \tag{2.4} $$

Under a Lorentz transformation $\Lambda$ that takes $p$ to $\Lambda p$, the state transforms as

$$ |p, r\rangle \to U(\Lambda) |p, r\rangle , \tag{2.5} $$

where $U(\Lambda)$ is a unitary operator in some representation of the Lorentz group. The
4-momentum itself transforms in the fundamental representation, so that

$$ U^\dagger(\Lambda) P^\mu U(\Lambda) = \Lambda^\mu{}_\nu P^\nu . \tag{2.6} $$
The 4-momentum of the transformed state is therefore given by

$$ P^\mu U(\Lambda) |p, r\rangle = U(\Lambda) U^\dagger(\Lambda) P^\mu U(\Lambda) |p, r\rangle = U(\Lambda) \Lambda^\mu{}_\nu p^\nu |p, r\rangle = (\Lambda p)^\mu U(\Lambda) |p, r\rangle , \tag{2.7} $$

which implies that $U(\Lambda) |p, r\rangle$ must be a linear combination of states with 4-momentum $\Lambda p$:

$$ U(\Lambda) |p, r\rangle = \sum_{r'} c_{rr'}(p, \Lambda) |\Lambda p, r'\rangle . \tag{2.8} $$

If the matrix $c_{rr'}(p, \Lambda)$ in Eq. (2.8), for some fixed $p$, is written in block-diagonal form, then
each block gives an irreducible representation of the Lorentz group. We will call particles
in the same irreducible representation "polarizations." The number of polarizations is the
dimension of the corresponding irreducible representation.¹



2.2.1 The little group 

For a particle with mass given by $m = \sqrt{p^2} \geq 0$, let us choose an arbitrary reference
4-momentum $k$ such that $k^2 = m^2$. Any 4-momentum with the same invariant norm can be
written as

$$ p^\mu = K(p)^\mu{}_\nu k^\nu \tag{2.9} $$

for some appropriate Lorentz transformation $K(p)$.

Let us then define the "little group" as the group of Lorentz transformations $I$ that
leaves the reference $k^\mu$ invariant:

$$ I^\mu{}_\nu k^\nu = k^\mu . \tag{2.10} $$

Then Eq. (2.8) can be approached by considering $D_{rr'}(I) = c_{rr'}(p = k, \Lambda = I)$ so that

$$ U(I) |k, r\rangle = \sum_{r'} D_{rr'}(I) |k, r'\rangle \tag{2.11} $$

and defining 1-particle states with other 4-momenta by:

$$ |p, r\rangle = N(p) U(K(p)) |k, r\rangle , \tag{2.12} $$

where $N(p)$ is a normalization factor. If we impose that

$$ \langle k', r' | k, r\rangle = \delta_{r'r} \delta^3(\mathbf{k}' - \mathbf{k}) \tag{2.13} $$

for states with 4-momentum $k$, then

$$ \begin{aligned} \langle p', r' | p, r\rangle &= N^*(p') N(p) \langle k', r' | U^\dagger(K(p')) U(K(p)) | k, r\rangle \\ &= N^*(p') N(p) D_{r'r}\left(K^{-1}(p') K(p)\right) \delta^3(\mathbf{k}' - \mathbf{k}) . \end{aligned} \tag{2.14} $$

Since the $\delta$-function in the second line vanishes unless $k' = k$, this implies that the overlap
is zero unless $p' = p$, and the $D$ matrix in Eq. (2.14) is therefore trivial:

$$ \langle p', r' | p, r\rangle = |N(p)|^2 \delta_{r'r} \delta^3(\mathbf{k}' - \mathbf{k}) . \tag{2.15} $$

We wish to rewrite Eq. (2.15) in terms of $\delta^3(\mathbf{p}' - \mathbf{p})$, to which we have argued it must be
proportional. It is not difficult to show that $d^3p/p^0$ is a Lorentz-invariant measure when
integrating on the mass shell $p^0 = \sqrt{\mathbf{p}^2 + m^2}$. This implies that $\delta^3(\mathbf{k}' - \mathbf{k}) = \delta^3(\mathbf{p}' - \mathbf{p})\, p^0/k^0$
and we therefore have that

$$ \langle p', r' | p, r\rangle = |N(p)|^2 \delta_{r'r} \delta^3(\mathbf{p}' - \mathbf{p})\, p^0/k^0 . \tag{2.16} $$

Equation (2.16) naturally leads to the choice of normalization

$$ N(p) = \sqrt{k^0/p^0} . \tag{2.17} $$

¹ Notice that in this choice of language a Dirac fermion has four polarizations: the spin-up and spin-down
fermion, plus the spin-up and spin-down antifermion.

2.2.2 Massive particles 

A massive particle will always have a rest frame in which its 4-momentum is $k^\mu = (m, 0, 0, 0)$.
This is, therefore, the natural choice of reference 4-momentum. It is easy to check that the
little group is then SO(3), which is the subgroup of the Lorentz group that includes only
rotations.

The generators of SO(3) may be written as

$$ J^i = -i \epsilon^{ijk} x^j \nabla^k , \tag{2.18} $$

which are the angular momentum operators and which obey the commutation relation

$$ [J^i, J^j] = i \epsilon^{ijk} J^k . \tag{2.19} $$

The Lie algebra of SO(3) is the same as that of SU(2), because both groups look identical in
the neighborhood of the identity. In quantum mechanics, the intrinsic angular momentum
of a particle (its spin) is a label of the dimensionality of the representation of SU(2) that
we assign to it. A particle of spin $j$ lives in the $(2j + 1)$-dimensional representation of SU(2).

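The generators of SO(1,3) may be written as

$$ J^{\mu\nu} = i \left( x^\mu \partial^\nu - x^\nu \partial^\mu \right) , \tag{2.20} $$

which are clearly anti-symmetric in the indices and which obey the commutation relation

$$ [J^{\mu\nu}, J^{\rho\sigma}] = i \left( \eta^{\nu\rho} J^{\mu\sigma} - \eta^{\mu\rho} J^{\nu\sigma} - \eta^{\nu\sigma} J^{\mu\rho} + \eta^{\mu\sigma} J^{\nu\rho} \right) . \tag{2.21} $$

We may write the six independent components of $J^{\mu\nu}$ as two three-component vectors:

$$ K^i = J^{0i} ; \qquad L^i = \frac{1}{2} \epsilon^{ijk} J^{jk} , \tag{2.22} $$

where $\mathbf{K}$ is the generator of boosts and $\mathbf{L}$ is the generator of rotations. Using Eqs. (2.21)
and (2.22), one can immediately show that these satisfy the commutation relations:

$$ [L^i, L^j] = i \epsilon^{ijk} L^k ; \qquad [L^i, K^j] = i \epsilon^{ijk} K^k ; \qquad [K^i, K^j] = -i \epsilon^{ijk} L^k . \tag{2.23} $$

Let us define two new 3-vectors:

$$ \mathbf{J}_\pm = \frac{1}{2} \left( \mathbf{L} \pm i \mathbf{K} \right) . \tag{2.24} $$

Using Eq. (2.23) we can write their commutators as

$$ [J_\pm^i, J_\pm^j] = i \epsilon^{ijk} J_\pm^k ; \qquad [J_\pm^i, J_\mp^j] = 0 . \tag{2.25} $$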



That is, both $\mathbf{J}_+$ and $\mathbf{J}_-$ separately satisfy the commutation relation for angular
momentum, and they also commute with each other. This means that we can identify all
finite-dimensional representations of the Lorentz group SO(1,3) by a pair of integer or
half-integer spins $(j_+, j_-)$ that correspond to two uncoupled representations of SO(3). The
Lorentz-transformation property of a left-handed Weyl fermion $\psi_L$ corresponds to $(1/2, 0)$,
while $(0, 1/2)$ corresponds to the right-handed Weyl fermion $\psi_R$. A massive Dirac fermion
corresponds to the representation $(1/2, 0) \oplus (0, 1/2)$.

A Lorentz 4-vector (that is, a quantity that transforms under the fundamental
representation of SO(1,3)) corresponds to $(1/2, 1/2)$. This indicates that it can be decomposed
into a spin-1 and a spin-0 part, since $1/2 \otimes 1/2 = 1 \oplus 0$. Or, to put it otherwise, a general
Lorentz vector has four independent components, three of which may be matched to the
three polarizations of a $j = 1$ particle and one to the single polarization of a $j = 0$ particle.
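The statements above can be checked numerically in the 4-vector representation. The sketch below is an illustration under stated assumptions: it takes the standard matrix form $(J^{\mu\nu})^\alpha{}_\beta = i(\eta^{\mu\alpha}\delta^\nu_\beta - \eta^{\nu\alpha}\delta^\mu_\beta)$ for the generators acting on a 4-vector, with signature $(+,-,-,-)$, and all function and variable names are hypothetical. It builds $\mathbf{L}$, $\mathbf{K}$, and $\mathbf{J}_\pm$, and checks that the two sets of generators each obey the angular momentum algebra, that they commute with each other, and that both Casimirs equal $\frac{1}{2}(\frac{1}{2}+1) = \frac{3}{4}$, as expected for $(1/2, 1/2)$.

```python
import numpy as np

ETA = np.diag([1.0, -1.0, -1.0, -1.0])  # Minkowski metric, signature (+,-,-,-)

# Totally antisymmetric symbol EPS[i, j, k] with spatial indices 0, 1, 2.
EPS = np.zeros((3, 3, 3))
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    EPS[i, j, k] = 1.0
    EPS[i, k, j] = -1.0


def lorentz_generator(mu, nu):
    """4x4 matrix (J^{mu nu})^a_b = i (eta^{mu a} delta^nu_b - eta^{nu a} delta^mu_b)."""
    mat = np.zeros((4, 4), dtype=complex)
    for a in range(4):
        for b in range(4):
            mat[a, b] = 1j * (ETA[mu, a] * (nu == b) - ETA[nu, a] * (mu == b))
    return mat


# Rotations L^i = (1/2) eps^{ijk} J^{jk} and boosts K^i = J^{0i}, as in Eq. (2.22).
L = [0.5 * sum(EPS[i, j, k] * lorentz_generator(j + 1, k + 1)
               for j in range(3) for k in range(3)) for i in range(3)]
K = [lorentz_generator(0, i + 1) for i in range(3)]

# The combinations J_pm = (L +- i K) / 2 of Eq. (2.24).
J_PLUS = [0.5 * (L[i] + 1j * K[i]) for i in range(3)]
J_MINUS = [0.5 * (L[i] - 1j * K[i]) for i in range(3)]


def commutator(a, b):
    return a @ b - b @ a


def is_su2(gens):
    """True if [G^i, G^j] = i eps^{ijk} G^k holds for all i, j."""
    return all(
        np.allclose(commutator(gens[i], gens[j]),
                    1j * sum(EPS[i, j, k] * gens[k] for k in range(3)))
        for i in range(3) for j in range(3)
    )


def casimir(gens):
    """Sum of squares of the three generators."""
    return sum(g @ g for g in gens)


if __name__ == "__main__":
    print(is_su2(J_PLUS), is_su2(J_MINUS))
    print(np.real(np.diag(casimir(J_PLUS))))
```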



2.2.3 Massless particles 

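Since a massless particle has no rest frame, the simplest reference 4-momentum is $k^\mu = (1, 0, 0, 1)$.
The corresponding little group clearly contains as a subgroup rotations about the $z$-axis.
The little group can be parametrized as

$$ I(\phi, \delta, \eta)^\mu{}_\nu = \Lambda(\phi)^\mu{}_\rho \, \Lambda(\delta, \eta)^\rho{}_\nu , \tag{2.26} $$

where

$$ \Lambda(\phi)^\mu{}_\nu = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos\phi & \sin\phi & 0 \\ 0 & -\sin\phi & \cos\phi & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \tag{2.27} $$

and

$$ \Lambda(\delta, \eta)^\mu{}_\nu = \begin{pmatrix} 1 + \zeta & \delta & \eta & -\zeta \\ \delta & 1 & 0 & -\delta \\ \eta & 0 & 1 & -\eta \\ \zeta & \delta & \eta & 1 - \zeta \end{pmatrix} , \tag{2.28} $$

with $\zeta = (\delta^2 + \eta^2)/2$.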
It can be readily checked that

$$ \Lambda(\delta_1, \eta_1)^\mu{}_\rho \, \Lambda(\delta_2, \eta_2)^\rho{}_\nu = \Lambda(\delta_1 + \delta_2, \eta_1 + \eta_2)^\mu{}_\nu , \tag{2.29} $$

which implies that the little group is isomorphic to the group of rotations (by an angle $\phi$)
and translations (by a vector $(\delta, \eta)$) in two dimensions.² Unlike SO(3), this group, ISO(2),
is not semi-simple, i.e., it has an invariant abelian subgroup: the translation subgroup defined
by Eq. (2.28). (The rotation subgroup defined by Eq. (2.27) is also abelian, but it is not invariant.)

This leads to the important consequence that massless one-particle states $|p, r\rangle$ can have
only two polarizations, called "helicities," given by the component of the angular momentum
along its direction of motion. The physical reason for this is that only the angular
momentum component associated with the rotations in Eq. (2.27) can define discrete polarizations.
Helicities are Lorentz-invariant, unlike the polarizations of a massive particle.

It is clear that massless particles in QFT are different from massive ones. It is possible to
understand some of the properties of massless particles by considering them as massive and
then taking the $m \to 0$ limit carefully, but this discussion should make it apparent that this
limiting procedure is fraught with danger. We shall explore this issue in the construction
of the vector field.
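The properties of the massless little group quoted above lend themselves to a quick numerical check. The following sketch assumes the explicit matrix forms of $\Lambda(\phi)$ and $\Lambda(\delta, \eta)$ from Eqs. (2.27) and (2.28) with index order $(0, 1, 2, 3)$; the function names and test values are hypothetical. It verifies that both matrices are Lorentz transformations that leave $k = (1, 0, 0, 1)$ fixed, that the "translations" compose additively as in Eq. (2.29), and that conjugating a translation by a rotation gives another translation.

```python
import numpy as np

ETA = np.diag([1.0, -1.0, -1.0, -1.0])   # Minkowski metric, signature (+,-,-,-)
K_REF = np.array([1.0, 0.0, 0.0, 1.0])   # light-like reference 4-momentum


def rotation(phi):
    """Rotation by angle phi about the z-axis, assumed form of Eq. (2.27)."""
    c, s = np.cos(phi), np.sin(phi)
    return np.array([[1.0, 0.0, 0.0, 0.0],
                     [0.0, c, s, 0.0],
                     [0.0, -s, c, 0.0],
                     [0.0, 0.0, 0.0, 1.0]])


def translation(delta, eta):
    """Little-group 'translation', assumed form of Eq. (2.28)."""
    zeta = 0.5 * (delta ** 2 + eta ** 2)
    return np.array([[1.0 + zeta, delta, eta, -zeta],
                     [delta, 1.0, 0.0, -delta],
                     [eta, 0.0, 1.0, -eta],
                     [zeta, delta, eta, 1.0 - zeta]])


def is_lorentz(lam):
    """True if lam preserves the Minkowski metric: lam^T eta lam = eta."""
    return np.allclose(lam.T @ ETA @ lam, ETA)


def conjugated_translation(phi, delta, eta):
    """R(phi) S(delta, eta) R(phi)^{-1}, which should again be a translation."""
    rot = rotation(phi)
    return rot @ translation(delta, eta) @ rot.T


if __name__ == "__main__":
    s1, s2 = translation(0.3, -1.2), translation(-0.7, 0.4)
    print(is_lorentz(s1), np.allclose(s1 @ K_REF, K_REF))
    print(np.allclose(s1 @ s2, translation(0.3 - 0.7, -1.2 + 0.4)))
```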



2.3 The vector field 

We seek a causal, free quantum field that transforms like a Lorentz 4-vector. By analogy
to the procedure used to obtain free quantum fields with spin 0 and 1/2 (see, e.g., Chapters
2 and 3 in [2], or Sections 5.2 to 5.5 in ^), we start by writing

$$ A^\mu(x) = \int \frac{d^3p}{(2\pi)^{3/2}} \sum_r \left( e_r^\mu(p)\, a_r(p)\, e^{ip\cdot x} + e_r^{\mu *}(p)\, a_r^\dagger(p)\, e^{-ip\cdot x} \right) , \tag{2.30} $$

where the index $r$ runs over the physical polarizations of the field, while $a^\dagger$ and $a$ are
the creation and destruction operators for particles of the corresponding momentum and
polarization that obey bosonic commutation relations, and $p^\mu = \left( \sqrt{m^2 + \mathbf{p}^2}, \mathbf{p} \right)$.

² The Lorentz transformation in Eq. (2.28) is, of course, not a physical translation. It just happens that
the group of such matrices is isomorphic to the group of translations on the plane.

Let $K(p)$ be the Lorentz transformation (boost) that takes a particle of mass $m$ from
rest to a 4-momentum $p$. It can be shown that the measure $d^3p/p^0$ is Lorentz-invariant
when integrating on the mass-shell $p^2 = m^2$. Since both $p^\mu$ and $A^\mu$ are Lorentz 4-vectors,
we must have

$$ e_r^\mu(p) = \sqrt{\frac{m}{p^0}}\, K(p)^\mu{}_\nu\, e_r^\nu(0) . \tag{2.31} $$

Now consider the behavior of $e_r^\mu(0)$ under an infinitesimal rotation. For our field $A^\mu(x)$
in Eq. (2.30) to have a definite spin $j$, we must have that

$$ \mathbf{L}^\mu{}_\nu\, e_r^\nu(0) = \sum_{r'} \mathbf{S}^{(j)}_{r'r}\, e_{r'}^\mu(0) , \tag{2.32} $$

where the three components of $\mathbf{S}^{(j)}$ are the standard spin matrices for spin $j$. Equation
(2.32) follows immediately from requiring $e_r^\mu(0)$ to transform under rotations as both a
4-vector and as a spin-$j$ object.³

For the rotation generators in the fundamental representation of SO(1,3) we have:

$$ (L^k)^0{}_0 = (L^k)^0{}_i = (L^k)^i{}_0 = 0 , \tag{2.33} $$

$$ (L^k)^i{}_j = -i \epsilon^{ijk} . \tag{2.34} $$

Therefore, for $(\mathbf{L}^2)^\mu{}_\nu = \sum_i (L^i)^\mu{}_\rho (L^i)^\rho{}_\nu$, we have

$$ (\mathbf{L}^2)^0{}_0 = (\mathbf{L}^2)^0{}_i = (\mathbf{L}^2)^i{}_0 = 0 ; \qquad (\mathbf{L}^2)^i{}_j = 2 \delta^i_j . \tag{2.35} $$

Meanwhile, recall that, for the spin matrices,

$$ \left( \mathbf{S}^{(j)} \right)^2_{rr'} = j(j+1)\, \delta_{rr'} . \tag{2.36} $$

Using Eqs. (2.35) and (2.36) in Eq. (2.32), we therefore obtain that

$$ e_r^i(0) = \frac{j(j+1)}{2}\, e_r^i(0) ; \qquad j(j+1)\, e_r^0(0) = 0 . \tag{2.37} $$

Equation (2.37), combined with Eq. (2.31), leaves us only two possibilities if the field
$A^\mu(x)$ in Eq. (2.30) is to transform as a 4-vector:

• Either $j = 0$ and $e^0(0)$ is the only non-vanishing component,

• or $j = 1$ and the three $e^i(0)$'s are the only non-vanishing components.

This agrees with the claim made at the end of the previous section, which we had based on
$1/2 \otimes 1/2 = 1 \oplus 0$. Let us explore both possibilities.

³ It should perhaps also be pointed out that in Eq. (2.32) the indices $\mu$, $\nu$ in the left-hand side indicate
components of the three matrices $L^i$ defined in Eq. (2.22). In Eq. (2.21) $\mu$, $\nu$ labeled the matrices themselves.
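The counting argument above can be reproduced with a few lines of linear algebra. The sketch below assumes the matrix elements $(L^k)^i{}_j = -i\epsilon^{ijk}$ with vanishing time row and column, as in Eqs. (2.33) and (2.34); the helper names are hypothetical. It computes $\mathbf{L}^2$ on a 4-vector and solves $j(j+1) = \lambda$ for each eigenvalue $\lambda$, recovering one $j = 0$ component (the time component) and three $j = 1$ components (the spatial ones).

```python
import numpy as np

# Totally antisymmetric symbol EPS[i, j, k] with spatial indices 0, 1, 2.
EPS = np.zeros((3, 3, 3))
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    EPS[i, j, k] = 1.0
    EPS[i, k, j] = -1.0


def rotation_generators():
    """Three 4x4 matrices with (L^k)^i_j = -i eps^{ijk}; time row and column vanish."""
    gens = []
    for k in range(3):
        mat = np.zeros((4, 4), dtype=complex)
        for i in range(3):
            for j in range(3):
                mat[i + 1, j + 1] = -1j * EPS[i, j, k]
        gens.append(mat)
    return gens


# L^2 = sum over the three generators of (L^k)^2, acting on a 4-vector.
L_SQUARED = sum(g @ g for g in rotation_generators())


def spin_from_casimir(lam):
    """Solve j (j + 1) = lam for the non-negative root j."""
    return (-1.0 + np.sqrt(1.0 + 4.0 * max(lam, 0.0))) / 2.0


EIGENVALUES = np.linalg.eigvalsh(L_SQUARED)
SPINS = sorted(round(float(spin_from_casimir(lam)), 6) for lam in EIGENVALUES)

if __name__ == "__main__":
    print(np.real(np.diag(L_SQUARED)))
    print(SPINS)
```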

2.3.1 Vector field with j = 0

For the $j = 0$ case we can choose the conventionally normalized $e^0(0) = i\sqrt{m/2}$ which, by
Eq. (2.31), gives

$$ e^\mu(p) = \frac{i p^\mu}{\sqrt{2 p^0}} . \tag{2.38} $$

One can then compare the resulting form for $A^\mu(x)$ in Eq. (2.30) to the form for a free
scalar field and conclude that this vector field has the form

$$ A^\mu(x) = \partial^\mu \phi(x) \tag{2.39} $$

for $\phi(x)$ a free, Lorentz scalar field. Notice that as the field $\phi$ has a single physical
polarization, so also does $A^\mu$, and that even though our construction of the vector field assumed
an $m > 0$ in Eq. (2.31), the $m \to 0$ limit in this case is perfectly sensible.⁴

2.3.2 Vector field with j = 1

Now consider the case where the vector field has $j = 1$. Following the popular convention
we write

$$ e_{r=\pm 1}^\mu(0) = \mp \frac{1}{2\sqrt{m}} \left( \eta^{\mu 1} \pm i \eta^{\mu 2} \right) \tag{2.40} $$

and

$$ e_{r=0}^\mu(0) = \frac{1}{\sqrt{2m}}\, \eta^{\mu 3} . \tag{2.41} $$

We may check that the raising and lowering operators $S_\pm^{(1)} = S_1^{(1)} \pm i S_2^{(1)}$ act appropriately
on these polarization vectors. For a plane-wave propagating along the $i = 3$ spatial direction,
$r = \pm 1$ correspond to two transverse, circular polarizations of the vector field, while $r = 0$
corresponds to the longitudinal polarization.

⁴ This kind of massless, spinless vector field will appear again in the discussion of the "ghost condensate"
mechanism in Chapter 4.
We may rewrite the field $A^\mu$ in terms of polarization vectors that are mass-independent
by introducing

$$ \epsilon_r^\mu(0) = \sqrt{2m}\, e_r^\mu(0) ; \tag{2.42} $$

then we have that Eq. (2.30) becomes

$$ A^\mu(x) = \int \frac{d^3p}{(2\pi)^{3/2} \sqrt{2p^0}} \sum_r \left( \epsilon_r^\mu(p)\, a_r(p)\, e^{ip\cdot x} + \epsilon_r^{\mu *}(p)\, a_r^\dagger(p)\, e^{-ip\cdot x} \right) , \tag{2.43} $$

where $\epsilon_r^\mu(p) = K(p)^\mu{}_\nu\, \epsilon_r^\nu(0)$. The field in Eq. (2.43) obeys the equation of motion

$$ \left( \Box - m^2 \right) A^\mu(x) = 0 . \tag{2.44} $$

Notice also that

$$ p_\mu \epsilon_r^\mu(p) = p_\mu K(p)^\mu{}_\nu\, \epsilon_r^\nu(0) = \left( K^{-1}(p)\, p \right)_\nu \epsilon_r^\nu(0) = m\, \epsilon_r^0(0) = 0 \tag{2.45} $$

implies that

$$ \partial_\mu A^\mu = 0 . \tag{2.46} $$

In the limit $m \to 0$ the boost $K(p)$ becomes the identity and $\epsilon_r^\mu(p) = \epsilon_r^\mu(0)$ for all $p$.
The field then obeys both $\Box A^\mu = 0$ and $\partial_\mu A^\mu = 0$.⁵ The fact that there are complications
in this limit is revealed by using Eq. (2.31) and the form of the $\epsilon_r^\mu(0)$'s to obtain

$$ \Pi^{\mu\nu}(p) \equiv \sum_{r=-1}^{1} \epsilon_r^\mu(p)\, \epsilon_r^{\nu *}(p) = \eta^{\mu\nu} + \frac{p^\mu p^\nu}{m^2} . \tag{2.47} $$

Notice that $\Pi^{\mu\nu} p_\mu = 0$, while $\Pi^{\mu\nu} k_\nu = k^\mu$ for $k \cdot p = 0$, which means $\Pi^{\mu\nu}$ is a projection
onto the space orthogonal to $p^\mu$. Equation (2.47) clearly is not finite as $m \to 0$. This will
be a problem if we try to directly couple $A^\mu$ to anything in a Lorentz-invariant way,

$$ \mathcal{L}_{\rm int} \propto A^\mu j_\mu , \tag{2.48} $$

because then the rate at which $A^\mu$'s would be emitted by the interaction would be
proportional to

$$ \sum_r \left| \langle j_\mu \rangle\, \epsilon_r^\mu(p) \right|^2 = \langle j_\mu \rangle \langle j_\nu \rangle^* \, \Pi^{\mu\nu}(p) , \tag{2.49} $$

which clearly diverges as $m \to 0$ unless we impose that $p \cdot \langle j \rangle = 0$. That is, in the presence
of an interaction of the form Eq. (2.48), we must require that the current to which the field
couples be conserved,

$$ \partial_\mu j^\mu = 0 , \tag{2.50} $$

in order to avoid an infinite rate of emission.

As emphasized earlier in this chapter, the spin of a massless particle must point either
parallel or anti-parallel to its direction of propagation. These possibilities correspond to the
transverse polarizations $\epsilon_{r=\pm 1}^\mu$. A massless particle cannot have a longitudinal polarization
$\epsilon_{r=0}^\mu$. The requirement of current conservation in Eq. (2.50) ensures that the longitudinal
polarization decouples from the current in the $m \to 0$ limit, so that it cannot be produced
by the interaction in Eq. (2.48).

⁵ Therefore taking the $m \to 0$ limit of the spin-1 vector field automatically gives us the massless field in
the Lorenz gauge.

2.3.3 Massless j = 1 particles 

Let us now try to construct a genuinely massless vector field with non-zero spin $j$. To
that effect we adopt an arbitrary reference momentum $\mathbf{k} = (0, 0, 1)$ and a corresponding
light-like reference 4-momentum $k^\mu = (1, 0, 0, 1)$. Let $K(p)$ be now defined as the Lorentz
transformation that takes a massless particle with reference momentum $\mathbf{k}$ to a general
momentum $\mathbf{p}$. We can write this transformation as the composition of a rotation (from the
direction of $\mathbf{k}$ to the direction of $\mathbf{p}$) followed by a boost along the direction of $\mathbf{p}$ that scales
the magnitude. Then

$$ \epsilon_r^\mu(p) = K(p)^\mu{}_\nu\, \epsilon_r^\nu(k) . \tag{2.51} $$

We now require that $\epsilon_r^\mu(k)$ transform as both a massless particle with helicity $r = \pm j$
and as a 4-vector. For rotations by an angle $\phi$ around the axis of $\mathbf{k}$, we must have

$$ e^{ir\phi}\, \epsilon_r^\mu(k) = \Lambda(\phi)^\mu{}_\nu\, \epsilon_r^\nu(k) , \tag{2.52} $$

where $\Lambda(\phi)^\mu{}_\nu$ is the Lorentz transformation matrix corresponding to the rotation, given in
Eq. (2.27). For Eq. (2.52) to be true of a general $\phi$ in the $j = 1$ case, we must have

$$ \epsilon_{\pm 1}^\mu(k) \propto (0, 1, \pm i, 0) \tag{2.53} $$

and we might as well normalize this solution to match the $\epsilon_r^\mu$'s in Eq. (2.42), giving

$$ \epsilon_{\pm 1}^\mu(k) = \frac{1}{\sqrt{2}} (0, 1, \pm i, 0) . \tag{2.54} $$

These are the same polarization vectors that we obtained previously in the $m \to 0$ limit of
the massive vector field.

But the little group for massless particles is larger than the SO(2) = U(1) group
represented by Eq. (2.27), as was seen in Subsection 2.2.3. For our field to transform as a
4-vector we would also require that

$$ \epsilon_r^\mu(k) = \Lambda(\delta, \eta)^\mu{}_\nu\, \epsilon_r^\nu(k) , \tag{2.55} $$

where $\Lambda(\delta, \eta)^\mu{}_\nu$ was given in Eq. (2.28). Plugging in the polarization 4-vectors in Eq. (2.54)
we can see immediately that this is impossible because, under the transformation $\Lambda(\delta, \eta)$,

$$ \epsilon_{\pm 1}^\mu(k) \to \epsilon_{\pm 1}^\mu(k) + \frac{\delta \pm i\eta}{\sqrt{2}}\, k^\mu . \tag{2.56} $$

Thus we are forced to accept that the one-particle states of a massless spin-1 vector field
are not Lorentz-covariant under the action of their little group, but only covariant up to
a term proportional to the reference $k^\mu$. If we then construct the general states using Eq.
(2.51) and

$$ A^\mu(x) = \int \frac{d^3p}{(2\pi)^{3/2} \sqrt{2p^0}} \sum_{r=\pm 1} \left( \epsilon_r^\mu(p)\, a_r(p)\, e^{ip\cdot x} + \epsilon_r^{\mu *}(p)\, a_r^\dagger(p)\, e^{-ip\cdot x} \right) \tag{2.57} $$

we see that we are forced to accept that $A^\mu(x)$ transforms under a general Lorentz
transformation $\Lambda$ as:

$$ A^\mu(x) \to \Lambda^\mu{}_\nu A^\nu(\Lambda x) + \partial^\mu \Omega(x, \Lambda) , \tag{2.58} $$

where $\Omega$ is some function of the coordinates $x$ and the parameters of the Lorentz
transformation $\Lambda$.
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Equation ()2.58|) should, in my opinion, be regarded as a disaster. Massless spin-1 quan- 
tum fields, which we need in order to explain the observed properties of the electromagnetic 
interaction, are incompatible with one of the most sacred principles of modern physics: 
Lorentz covariance.^ It is not, however, an irretrievable disaster, and in fact there will be a 
rich silver lining to it. 

We can "save" Lorentz covariance by announcing that two fields related by the transformation

$$ A^\mu \to A^\mu + \partial^\mu \Omega \tag{2.59} $$

describe the same physics, so that the second term in Eq. (2.58) becomes irrelevant.^7 We
can couple such an $A^\mu$ if the interaction is of the form $\mathcal{L}_{\rm int} \propto A^\mu j_\mu$ for a conserved current
$j^\mu$, because in that case the coupling is invariant (up to a total derivative) under
transformations of the form in Eq. (2.59). Notice that this requirement on the coupling of
$A^\mu$ agrees with what we imposed earlier, by Eq. (2.49), in order to avoid an infinite rate of
emission for the vector field in the $m \to 0$ limit.

It is easy to construct a genuinely Lorentz-covariant two-index field strength tensor that
is invariant under Eq. (2.59):

$$ F^{\mu\nu} = \partial^\mu A^\nu - \partial^\nu A^\mu . \tag{2.60} $$

Lorentz-invariant couplings to this field strength would be gauge-invariant, but the presence
of derivatives in Eq. (2.60) means that the resulting forces must fall off faster with distance
than an inverse-square law (i.e., they cannot be long-range forces).
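The invariance of the field strength in Eq. (2.60) under the shift in Eq. (2.59) can be checked numerically. The following is a minimal sketch under stated assumptions: the test field `gauge_field` and the gauge function whose gradient is returned by `gauge_param_gradient` are arbitrary smooth functions invented for illustration, derivatives are estimated by central differences, and index placement is ignored because the cancellation does not depend on the metric.

```python
import math

H = 1e-4  # finite-difference step; an illustrative choice

def gauge_field(x):
    """Arbitrary smooth test field A_mu(t, x1, x2, x3); purely illustrative."""
    t, x1, x2, x3 = x
    return [math.sin(x1) * t, x2 * x3, math.cos(t) + x1 ** 2, x1 * x2 * t]

def gauge_param_gradient(x):
    """Analytic gradient of the test function Omega = sin(t*x1) + x2**2 * x3."""
    t, x1, x2, x3 = x
    return [x1 * math.cos(t * x1), t * math.cos(t * x1), 2 * x2 * x3, x2 ** 2]

def transformed_field(x):
    """A_mu + d_mu Omega, i.e. the shifted field of Eq. (2.59)."""
    return [a + g for a, g in zip(gauge_field(x), gauge_param_gradient(x))]

def partial(field, mu, nu, x):
    """Central-difference estimate of d_mu field_nu at the point x."""
    forward, backward = list(x), list(x)
    forward[mu] += H
    backward[mu] -= H
    return (field(forward)[nu] - field(backward)[nu]) / (2 * H)

def field_strength(field, x):
    """Antisymmetric combination d_mu A_nu - d_nu A_mu at the point x."""
    return [[partial(field, mu, nu, x) - partial(field, nu, mu, x)
             for nu in range(4)] for mu in range(4)]

def max_field_strength_change(x):
    """Largest change of any component of F under the shift of Eq. (2.59)."""
    before = field_strength(gauge_field, x)
    after = field_strength(transformed_field, x)
    return max(abs(after[mu][nu] - before[mu][nu])
               for mu in range(4) for nu in range(4))
```

Any residual difference comes only from the finite-difference approximation, since the second derivatives of the gauge function cancel in the antisymmetric combination.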

^6 This statement may seem peculiar in light of the fact that the Lorentz group was first discovered as the
symmetry of the Maxwell equations of classical electrodynamics. But those equations are written in terms of
the fields $\mathbf{E}$ and $\mathbf{B}$. The scalar and vector potentials ($A^0$ and $\mathbf{A}$ respectively) enter classical electrodynamics
only as computational aids. It is quantum mechanics which requires a formulation in terms of $A^\mu$.

^7 This irresistibly brings to my mind a scene from the Woody Allen movie comedy Bananas in which
victorious rebel commander Esposito announces from the Presidential Palace that "from this day on, the
official language of San Marcos will be Swedish... Furthermore, all children under 16 years old are now 16
years old."

2.4 Why local gauge invariance?

The Dirac Lagrangian for a free fermion, $\mathcal{L} = \bar\psi (i \partial\!\!\!/ - m) \psi$, is invariant under the global
$U(1)$ gauge transformation $\psi \to e^{i\alpha} \psi$. This global symmetry, by Noether's theorem, implies
conservation of the current:

$$ \partial_\mu j^\mu = \partial_\mu \left( \bar\psi \gamma^\mu \psi \right) = 0 . \tag{2.61} $$

In the established model of quantum electrodynamics, this Lagrangian is transformed into
an interacting theory by making the gauge invariance local: The phase $\alpha$ is allowed to be a
function of the space-time point $x$. This requires the introduction of a gauge field $A^\mu$ with
the transformation property

$$ A^\mu \to A^\mu + \partial^\mu \alpha \tag{2.62} $$

and the use of a covariant derivative $D_\mu = \partial_\mu - iA_\mu$ instead of the usual derivative $\partial_\mu$.
This procedure automatically couples $A^\mu$ to the conserved current in Eq. (2.61) so that the
coupling is invariant under transformations of the form Eq. (2.62). We then add a
Lorentz-invariant kinetic term $-F_{\mu\nu}^2/4$ for the field $A^\mu$. The generalization to non-abelian gauge
groups is well known, as is the Higgs mechanism to break the gauge invariance spontaneously
and give the field $A^\mu$ a mass.

This is what we are taught in elementary courses on QFT, but the question remains:
Why do we promote a global symmetry of the free fermion Lagrangian to a local symmetry?
Equation (2.58) provides a deeper insight into the physical meaning of local gauge invariance:
a massless particle, having no rest frame, cannot have its spin point along any axis other
than that of its motion. Therefore, it can have only two polarizations. By describing it as
a 4-vector, spin-1 field $A^\mu$ (which has three polarizations) a mathematical redundancy is
introduced.

This redundancy is local gauge invariance. A field with local gauge symmetry is coupled
to the conserved current of the corresponding global gauge symmetry in order to make the
coupling locally gauge-invariant. The procedure described of promoting the global gauge
symmetry to a local gauge invariance is therefore required in order to couple fermions in a
Lorentz-invariant way via a long-range, spin-1 force.

2.4.1 Expecting the Higgs

Remarkably, local gauge invariance also comes to our aid in writing sensible QFT's for the
short-range weak nuclear interaction. At low energies, this interaction is naturally described
as being mediated by massive, spin-1 vector fields. The Lagrangian for such a mediator must
look like

$$ \mathcal{L} = -\frac{1}{4} F_{\mu\nu}^2 + \frac{1}{2} m^2 A^2 - A_\mu J^\mu , \tag{2.63} $$

where $J^\mu$ is the current to which it couples. But in the case of the weak nuclear interaction
this current is not conserved. At energy scales much higher than the $m$ in Eq. (2.63), we
therefore expect the same problem we found in Subsection 2.3.2 of a divergent emission rate
for the longitudinal polarization, unless other higher-derivative operators, which were not
relevant at low energies, have come to our rescue.

In the standard model of particle physics, the resolution of this problem is to make 
the mediators of the weak nuclear interaction gauge bosons, and then to break that gauge 
invariance spontaneously by introducing a scalar Higgs field with a non-zero VEV, thus 
giving the bosons the mass that accounts for the short range of the force they mediate. At 
high energies the gauge invariance is restored. The problematic longitudinal polarization 
disappears and is transmuted into the Goldstone boson of the spontaneously broken sym- 
metry. Since the Goldstone boson has no spin, it does not have the problem of a divergent 
rate of emission. This is the reason why many billions of dollars have been spent in the 
search for that yet-unseen Higgs boson, a search soon to come to a head with the turning 
on of the Large Hadron Collider (LHC) at CERN next year. 

2.4.2 Further successes of gauge theories 

Gauge theories as descriptions of the fundamental particle interactions have other very
attractive attributes. It was shown by 't Hooft that these theories are always renormalizable,
i.e., that the infinities that plague QFT's can all be absorbed into a redefinition of the bare
parameters of the theory, namely the masses and the coupling constants ([5]). Politzer
([6]) and, independently, Gross and Wilczek ([7, 8]) showed that the renormalization flow
of the coupling constants in non-abelian gauge theories provides a natural explanation of
the observed phenomenon of asymptotic freedom, whereby the nuclear interactions become
more feeble at higher energies.

It is also widely believed, though not strictly demonstrated, that QCD, the theory
in which the strong nuclear force is mediated by the bosons of an $SU(3)$ gauge theory,
accounts for confinement, i.e., for the fact that the strongly interacting fermions (quarks)
never occur alone and can appear only in bound states that are singlets of $SU(3)$. These
successes illustrate what we meant when we said in Subsection 2.3.3 that having to accept
local gauge symmetry was a disaster with a rich silver lining. For interesting accounts of
the history of local gauge invariance in classical and quantum physics, see ([9]).

2.5 Massless j = 2 particles and diffeomorphism invariance 

We could repeat the sort of procedure used in Subsection 2.3.3 in order to try to construct
a Lorentz-covariant $h^{\mu\nu}$ out of the two helicities of a $j = 2$ massless field. This procedure
would similarly fail, requiring us to accept the transformation rule:

$$ h^{\mu\nu}(x) \to \Lambda^\mu{}_\rho \Lambda^\nu{}_\sigma h^{\rho\sigma}(\Lambda x) + \partial^\mu \epsilon^\nu(x, \Lambda) + \partial^\nu \epsilon^\mu(x, \Lambda) . \tag{2.64} $$

Saving Lorentz covariance would then require announcing that states related by a transformation
of the form

$$ h^{\mu\nu} \to h^{\mu\nu} + \partial^\mu \epsilon^\nu + \partial^\nu \epsilon^\mu \tag{2.65} $$

are physically equivalent. We can construct a four-index field strength tensor $R^{\mu\nu\rho\sigma}$ invariant
under Eq. (2.65) that is anti-symmetric in $\mu, \nu$, anti-symmetric in $\rho, \sigma$, and symmetric
under exchange of the two pairs. But to accommodate a long-range force we would need to
couple $h^{\mu\nu}$ to a quantity $\Theta^{\mu\nu}$ such that

$$ \partial_\mu \langle \Theta^{\mu\nu} \rangle = 0 . \tag{2.66} $$

This $\Theta^{\mu\nu}$ is the stress-energy tensor obtained from translational invariance

$$ x^\mu \to x^\mu - \epsilon^\mu , \tag{2.67} $$

through Noether's theorem.^8 Invariance under Eq. (2.65) corresponds to promoting the
translational symmetry in Eq. (2.67) to a local invariance by letting $\epsilon^\mu$ be a function of $x$.
It turns out that the theory constructed in this way matches linearized GR around a flat
background with $h^{\mu\nu}$ being the graviton field.

^8 If there were another conserved $\Theta'^{\mu\nu}$, there would have to be another conserved 4-vector besides $p^\mu$,
namely $p'^\mu = \int d^3x \, \Theta'^{0\mu}$. Kinematics would then allow only forward collisions.




It is well known that one can reconstruct the full GR uniquely from linear gravity by
a self-consistency procedure ([10, 11, 12]). Therefore a relativistic QFT in flat spacetime
with a massless spin-2 particle mediating a long-range force essentially implies GR. In full
GR the invariance under Eq. (2.65) is a consequence of the invariance of the theory under
diffeomorphisms:

$$ x^\mu \to x'^\mu(x) . \tag{2.68} $$

Remarkably, we may therefore think of diffeomorphism invariance as a redundancy required 
by the relativistic description of a massless spin-2 particle. 

2.6 The Weinberg-Witten theorem

The Weinberg-Witten theorem^9 rules out the existence of massless particles with higher spin
in a very wide class of QFT's ([4]). In their original paper, the authors present their elegant
proof very succinctly. This review is longer than the paper itself, which may be justified
by the importance of this result in further clarifying the need for local gauge invariance in
relativistic theories that accommodate long-range forces such as are observed in nature.

Let $|p, \pm j\rangle$ and $|p', \pm j\rangle$ be two one-particle, massless states of spin $j$, labelled by their
light-like 4-momenta $p$ and $p'$, and by their helicity (which we take to be the same for the
two particles). We will be considering the matrix elements

$$ \langle p', \pm j | \, j^\mu \, | p, \pm j \rangle \, ; \quad \langle p', \pm j | \, T^{\mu\nu} \, | p, \pm j \rangle , \tag{2.69} $$

where $j^\mu$ is a conserved current (i.e., $\partial_\mu \langle j^\mu \rangle = 0$) and $T^{\mu\nu}$ is a conserved stress-energy
tensor (i.e., $\partial_\mu \langle T^{\mu\nu} \rangle = 0$).

2.6.1 The j > 1/2 case

If we assume that the massless particles in question carry a non-zero conserved charge
$Q = \int d^3x \, j^0$, so that (suppressing the helicity label for now)

$$ Q \, |p\rangle = q \, |p\rangle , \tag{2.70} $$

where $q \neq 0$, then evidently

$$ \langle p' | \, Q \, | p \rangle = q \, \delta^3(\mathbf{p}' - \mathbf{p}) . \tag{2.71} $$

^9 According to the authors, a less general version of their theorem was formulated earlier by Sidney
Coleman, but was not published.


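Meanwhile, we also have that

$$ \langle p' | \, Q \, | p \rangle = \int d^3x \, \langle p' | \, j^0(t, \mathbf{x}) \, | p \rangle = \int d^3x \, \langle p' | \, e^{i \mathbf{P} \cdot \mathbf{x}} j^0(t, 0) \, e^{-i \mathbf{P} \cdot \mathbf{x}} \, | p \rangle $$
$$ = \int d^3x \, e^{i (\mathbf{p}' - \mathbf{p}) \cdot \mathbf{x}} \langle p' | \, j^0(t, 0) \, | p \rangle = (2\pi)^3 \delta^3(\mathbf{p}' - \mathbf{p}) \, \langle p' | \, j^0(t, 0) \, | p \rangle \tag{2.72} $$

so that combining Eqs. (2.71) and (2.72) gives

$$ \lim_{p' \to p} \langle p' | \, j^0(t, 0) \, | p \rangle = \frac{q}{(2\pi)^3} , \tag{2.73} $$

which, by Lorentz covariance, implies that

$$ \lim_{p' \to p} \langle p' | \, j^\mu(t, 0) \, | p \rangle = \frac{q \, p^\mu}{(2\pi)^3 \, |\mathbf{p}|} \neq 0 . \tag{2.74} $$

Notice that Eq. (2.74) implies current conservation, because $p^2 = 0$.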



For any light-like $p$ and $p'$

$$ (p' + p)^2 = 2 (p' \cdot p) = 2 \left( |\mathbf{p}'| \, |\mathbf{p}| - \mathbf{p}' \cdot \mathbf{p} \right) = 2 |\mathbf{p}'| \, |\mathbf{p}| (1 - \cos\theta) \geq 0 , \tag{2.75} $$

where $\theta$ is the angle between the momenta. If $\theta \neq 0$, then $(p' + p)$ is time-like and we can
therefore choose a frame in which it has no space component, so that

$$ p = (|\mathbf{p}|, \mathbf{p}) \, ; \quad p' = (|\mathbf{p}|, -\mathbf{p}) \tag{2.76} $$

(i.e., the two particles propagate in opposite directions with the same energy). In this frame,
consider rotating the particles by an angle $\phi$ around the axis of $\mathbf{p}$:

$$ |p, \pm j\rangle \to e^{\pm i \phi j} |p, \pm j\rangle \, ; \quad |p', \pm j\rangle \to e^{\mp i \phi j} |p', \pm j\rangle . \tag{2.77} $$

The Lorentz covariance of the matrix element of $j^\mu$ then implies that

$$ e^{\pm 2 i \phi j} \langle p', \pm j | \, j^\mu(t, 0) \, | p, \pm j \rangle = \Lambda(\phi)^\mu{}_\nu \langle p', \pm j | \, j^\nu(t, 0) \, | p, \pm j \rangle , \tag{2.78} $$

where $\Lambda(\phi)$ is the Lorentz transformation corresponding to a rotation by an angle $\phi$ around
the direction of $\mathbf{p}$. But $\Lambda(\phi)$ contains no Fourier components other than $e^{\pm i\phi}$ and 1, so
Eq. (2.78) implies that the matrix elements vanish for $j > 1/2$. In the limit $p' \to p$,
we then arrive at a contradiction with Eq. (2.74). Therefore no relativistic QFT with a
conserved current can have massless spin-1 particles (either fundamental or composite) that
have Lorentz-covariant spectra and are charged under the conserved current.
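The choice of frame behind Eqs. (2.75) and (2.76) can be illustrated numerically. The following is a minimal sketch, assuming the metric signature (+, -, -, -) implied by Eq. (2.75); the helper names (`lightlike`, `boost`, `to_back_to_back_frame`) are hypothetical and exist only for this illustration.

```python
import math

def minkowski_dot(a, b):
    """Inner product with signature (+, -, -, -)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]

def lightlike(energy, theta):
    """Massless 4-momentum in the x-z plane at polar angle theta."""
    return [energy, energy * math.sin(theta), 0.0, energy * math.cos(theta)]

def boost(p, beta):
    """Components of p in a frame moving with velocity beta (units with c = 1)."""
    beta2 = sum(b * b for b in beta)
    if beta2 == 0.0:
        return list(p)
    if beta2 >= 1.0:
        raise ValueError("total momentum is not time-like; no such frame exists")
    gamma = 1.0 / math.sqrt(1.0 - beta2)
    bp = sum(b * q for b, q in zip(beta, p[1:]))
    energy = gamma * (p[0] - bp)
    coeff = (gamma - 1.0) * bp / beta2 - gamma * p[0]
    return [energy] + [q + coeff * b for q, b in zip(p[1:], beta)]

def to_back_to_back_frame(p, p_prime):
    """Boost both momenta to the frame where p + p' has no space component."""
    total = [a + b for a, b in zip(p, p_prime)]
    beta = [component / total[0] for component in total[1:]]
    return boost(p, beta), boost(p_prime, beta)
```

For collinear momenta the sum is itself light-like, the required boost velocity reaches 1, and the sketch raises an error, which mirrors the condition that the angle between the momenta be non-zero.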

2.6.2 The j > 1 case 

If the massless particles in question carry no conserved charge, we may still consider the
matrix elements of the stress-energy tensor $T^{\mu\nu}$. By the same kind of argument as in
Subsection 2.6.1,

$$ \lim_{p' \to p} \langle p' | \, T^{\mu\nu}(t, 0) \, | p \rangle = \frac{p^\mu p^\nu}{(2\pi)^3 \, |\mathbf{p}|} . \tag{2.79} $$

Notice again that this stress-energy is conserved because $p^2 = 0$.

Then combining Eq. (2.77) with relativistic covariance implies that

$$ e^{\pm 2 i \phi j} \langle p', \pm j | \, T^{\mu\nu}(t, 0) \, | p, \pm j \rangle = \Lambda(\phi)^\mu{}_\rho \Lambda(\phi)^\nu{}_\sigma \langle p', \pm j | \, T^{\rho\sigma}(t, 0) \, | p, \pm j \rangle . \tag{2.80} $$

The fact that $\Lambda(\phi)$ contains only the Fourier components $e^{\pm i\phi}$ and 1 then implies that
the matrix elements must vanish for $j > 1$, contradicting Eq. (2.79) in the limit $p' \to p$.
Therefore no relativistic QFT with a conserved stress-energy tensor can have massless spin-2
particles (either fundamental or composite) that have Lorentz-covariant spectra.
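The Fourier-component argument used in Eqs. (2.78) and (2.80) can be phrased as an eigenvalue question: a non-vanishing matrix element requires the phase $e^{\pm 2i\phi j}$ to be an eigenvalue of the rotation acting on a vector (rank 1) or on a two-index tensor (rank 2). The following is a minimal numerical sketch of that statement, assuming an active rotation about the z axis and a single generic angle; the function names are hypothetical.

```python
import cmath
import math

def rotation_z(phi):
    """Assumed rotation by phi about the z axis acting on (t, x, y, z)."""
    c, s = math.cos(phi), math.sin(phi)
    return [[1, 0, 0, 0],
            [0, c, -s, 0],
            [0, s, c, 0],
            [0, 0, 0, 1]]

def kron(a, b):
    """Kronecker product of two square matrices (action on a two-index tensor)."""
    m = len(b)
    size = len(a) * m
    return [[a[i // m][j // m] * b[i % m][j % m] for j in range(size)]
            for i in range(size)]

def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [list(row) for row in matrix]
    n = len(m)
    result = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-13:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return result

def nonzero_matrix_element_allowed(spin, rank, phi=0.7, tol=1e-9):
    """True if exp(2i*spin*phi) is an eigenvalue of the rank-fold product of rotations."""
    rep = rotation_z(phi)
    for _ in range(rank - 1):
        rep = kron(rep, rotation_z(phi))
    phase = cmath.exp(2j * spin * phi)
    size = len(rep)
    shifted = [[rep[r][c] - (phase if r == c else 0) for c in range(size)]
               for r in range(size)]
    return abs(det(shifted)) < tol
```

Because the rotation matrix has eigenvalues 1 and $e^{\pm i\phi}$ only, the rank-1 check can succeed only for spin up to 1/2 and the rank-2 check only for spin up to 1, in line with the two cases of the theorem.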

2.6.3 Why are gluons and gravitons allowed? 

Evidently, the Weinberg-Witten theorem does not forbid photons, because they carry no 
conserved charge. It also does not forbid the $W$ and $Z$ bosons because they are massive.
But the Standard Model contains charged, massless spin-1 particles (the gluons) as well 
as massless spin-2 particles (the gravitons). How is this possible? The resolution of this 
question helps to clarify the necessity for local gauge invariance. 



In a Yang-Mills theory,

$$ \mathcal{L}_{\rm YM} = -\frac{1}{4} F^a_{\mu\nu} F^{a\,\mu\nu} + \bar\psi \, i D\!\!\!\!/ \, \psi , \tag{2.81} $$

the gauge-invariant current

$$ j_a^\mu = \bar\psi \gamma^\mu t_a \psi \tag{2.82} $$

is not conserved, because it obeys the equation $D_\mu \langle j_a^\mu \rangle = 0$, rather than $\partial_\mu \langle j_a^\mu \rangle = 0$.
Furthermore, $\langle j_a^\mu \rangle$ vanishes for one-particle gauge field states. Therefore considering the matrix
elements of this $j_a^\mu$ between gauge boson states in Yang-Mills theory would avail us nothing
because the limit in Eq. (2.74) would be zero.

What we actually want is a current that measures the flow of charge in the absence of
matter (i.e., for the Yang-Mills bosons alone) and that is conserved in the sense $\partial_\mu \langle J_a^\mu \rangle = 0$:

$$ J_a^\mu = -F_c^{\mu\nu} f_{cab} A_{b\nu} , \tag{2.83} $$

where the $f$'s are the structure constants of the gauge group. Conservation follows immediately
from the equation of motion for Eq. (2.81). This is, in fact, the conserved current
obtained through Noether's theorem from the global gauge invariance of Eq. (2.81) without
matter. But the current in Eq. (2.83) is obviously not gauge-invariant. Therefore, under
the action of a Lorentz transformation $\Lambda$,

$$ J_a^\mu \to \Lambda^\mu{}_\nu J_a^\nu + \partial^\mu \Omega_a \tag{2.84} $$

and it is not, consequently, Lorentz-covariant. If we tried making it Lorentz-covariant by
introducing an unphysical extra polarization of the gauge boson, then the theorem would
fail because the helicities would not be Lorentz-invariant, invalidating the choice-of-frame
procedure used to arrive at Eq. (2.77).

To put this in another way, in a gauge theory the physical $|p, \pm j\rangle$ states are actually
equivalence classes, because two states related by a gauge transformation represent the same
physics. A technical way of thinking about this is that the physical states are elements of
the BRST cohomology ([13]). Therefore, matrix elements such as those in Eq. (2.69)
are only well-defined if the operator is BRST-closed, which requires the operator to be
gauge-invariant. It is well known that Yang-Mills theories do not allow the construction of
gauge-invariant conserved currents.

The case of the graviton is very closely analogous to that of the Yang-Mills bosons. In
Einstein-Hilbert gravity

$$ S = \int d^4x \sqrt{-g} \left[ R + \mathcal{L}_{\rm matter}(\phi, \nabla_\mu \phi, g_{\mu\nu}) \right] , \tag{2.85} $$

where the field $\phi$ stands for all possible matter fields of any spin. The covariant stress-energy
tensor $T^{\mu\nu}$ obeys $\nabla_\mu \langle T^{\mu\nu} \rangle = 0$ rather than $\partial_\mu \langle T^{\mu\nu} \rangle = 0$, and $\langle T^{\mu\nu} \rangle = 0$ for any state
with only gravitational fields. What we want is therefore not $T$, but rather

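$$ \Theta_{\mu\nu} = \frac{\partial R}{\partial(\partial^\mu g_{\rho\sigma})} \partial_\nu g_{\rho\sigma} - g_{\mu\nu} R . \tag{2.87} $$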

But recall that the Ricci scalar $R$ contains not only the metric and its first derivatives, but
also terms linear in its second derivatives. In order to define $\Theta_{\mu\nu}$ we therefore need to do the
usual trick of integrating by parts and setting the boundary terms to zero in order to get
rid of the second derivatives in $R$. This means that $R$ is no longer a covariant scalar and
therefore $\Theta_{\mu\nu}$ is not a covariant tensor, but rather a pseudotensor.

It is well known that gravitational energy cannot be defined in a covariant way. For
instance, the energy of gravity waves on a flat background is localizable only for waves traveling
in a single direction, which is not a coordinate-invariant condition (see, for instance,
Chapter 33 in [14]). A general Lorentz transformation of the graviton field $h^{\mu\nu}$ will destroy
this condition. This means that the stress-energy pseudotensor $\Theta^{\mu\nu}$ for gravitons involves
a field $h^{\mu\nu}$ that does not transform like a Lorentz tensor. Its matrix elements are therefore
not Lorentz-covariant. Once again, if we attempt to remedy this by introducing unphysical
extra polarizations of the gravitons, the Lorentz invariance of the helicity is lost.

Otherwise stated, in a theory with diffeomorphism invariance like GR, the physical
states are equivalence classes, because two states related by a coordinate transformation
represent the same physics. The matrix elements in Eq. (2.69) are only well defined if the
operator $T^{\mu\nu}$ is BRST-closed, but GR admits no local BRST-closed operators, and thus
evades the Weinberg-Witten theorem.

Notice that even in theories with a local symmetry, such as QCD or GR, the Weinberg-Witten
theorem does rule out massless particles of higher spin that carry a conserved charge
associated with a symmetry that commutes with the local symmetry. For instance, the authors
of [4] point out that their result forbids QCD from having flavor non-singlet massless
bound states with $j \geq 1$, since flavor symmetries commute with the $SU(3)$ local gauge
symmetry. Similarly, a $j = 1$ gauge theory cannot produce composite gravitons with
Lorentz-covariant spectra, because translations in flat Minkowski space-time commute with
the gauge symmetry. Gauge theories admit the conserved, Lorentz-covariant
Belinfante-Rosenfeld stress-energy tensor.

2.6.4 Gravitons in string theory 

String theories have a massless spin-2 particle in their spectrum. This discovery killed the
original versions of string theory as possible descriptions of the strong nuclear interaction
(which was the context in which they had been proposed) and made modern string theory a
candidate for a quantum theory of gravity (see, for instance, Chapter 1 in [16]). The reason
why this result does not violate the Weinberg-Witten theorem is that it is not possible to
define a conserved stress-energy tensor in string theory.

Consider a string propagating in a $D$-dimensional background space-time with metric
$g_{ab}$, where $a, b = 0, 1, \ldots, D - 1$. If $S$ is the action in the background, then

$$ T^{ab} \propto \frac{1}{\sqrt{-g}} \frac{\delta S}{\delta g_{ab}} \tag{2.88} $$

is not well defined because a consistent string theory requires imposing superconformal symmetry
on the background, which in turn automatically requires $g_{ab}$ to obey an equation of
motion (at low energies this equation of motion corresponds to the Einstein field equation
of GR). The functional derivative in Eq. (2.88) cannot be defined because there is no consistent
off-shell definition of the background action $S$: The exact equation of motion for $g_{ab}$
in string theory does not come from extremizing the action with respect to the background
metric, but rather from a constraint required for consistency.^10

^10 I thank John Schwarz for clarifying this point for me.

In general, we expect that a theory with emergent diffeomorphism invariance would
not have a stress-energy tensor. The reason is that in the low-energy effective action (i.e.,
in GR) the graviton couples to a stress-energy tensor which is not observable because it
is not diffeomorphism-covariant. If the fundamental theory itself has no diffeomorphism
invariance, then it should not have a stress-energy tensor at all (see [17]).

2.7 Emergent gravity 

The Weinberg-Witten theorem can be read as the proof that massless particles of higher
spin cannot carry conserved Lorentz-covariant quantities. Local gauge invariance and
diffeomorphism invariance are natural ways of making those quantities mathematically
non-Lorentz-covariant without spoiling physical Lorentz covariance. It is nonetheless possible
and interesting to consider other ways of accommodating massless mediators with higher spin.
Despite the successes of gauge theories, the fact remains that there is no clearly compelling
a priori reason to impose local gauge invariance as an axiom, and that such an axiom has
the unattractive consequence that it makes our mathematical description of physical reality
inherently redundant (see, for instance, Chapter III.4 in [18]).

Also, while local gauge invariance guarantees renormalizability for spin 1, it is well known
that quantizing $h_{\mu\nu}$ in linear gravity does not produce a perturbatively renormalizable
field theory. One attractive solution to this problem would be to make the graviton a
composite, low-energy degree of freedom, with a natural cutoff scale $\Lambda_{\rm UV}$. The
Weinberg-Witten theorem represents a significant obstruction to this approach, because the result
applies equally to fundamental and to composite particles. Indeed, ruling out emergent
gravitons was the authors' purpose for establishing that theorem.

In a recent public lecture ([19]) Witten has made the strong claim that "whatever
we do, we are not going to start with a conventional theory of non-gravitational fields in
Minkowski spacetime and generate Einstein gravity as an emergent phenomenon." His
reasoning is that identifying emergent phenomena requires first defining a box in 3-space
and then integrating out modes with wavelengths shorter than the length of the edges of
the box (see Fig. 2.2). But Einstein gravity implies diffeomorphism invariance, and a
general coordinate transformation spoils the definition of our box. Witten's conclusion is
that gravity can be emergent only if the notion of the space-time on which diffeomorphism
invariance operates is simultaneously emergent. This is a plausible claim, but it goes beyond
what the Weinberg-Witten theorem actually establishes.

Figure 2.2: Schematic representation of Witten's argument that a general coordinate transformation
spoils the box used to define the modes that are integrated out in order to identify the emergent
low-energy physics for energy scales well below $\Lambda_{\rm UV}$.

In 1983, Laughlin explained the observed fractional quantum Hall effect in two-dimensional
electronic systems by showing how such a system could form an incompressible quantum
fluid whose excitations have charge $e/3$ ([20]). That is, the low-energy theory of the
interacting electrons in two spatial dimensions has composite degrees of freedom whose charge
is a fraction of that of the electrons themselves. In 2001, Zhang and Hu used techniques
similar to Laughlin's to study the composite excitations of a higher-dimensional system
([21]). They imagined a four-dimensional sphere in space, filled with fermions that interact
via an $SU(2)$ gauge field. In the limit where the dimensionality of the representation of
$SU(2)$ is taken to be very large, such a theory exhibits composite massless excitations of
integer spin 1, 2 and higher.

Like other theories from solid state physics, Zhang and Hu's proposal falls outside the 
scope of the Weinberg-Witten theorem because the proposed theory is not Lorentz-invariant:
The vacuum of the theory is not empty and has a preferred rest-frame (the rest frame of 
the fermions). However, the authors argued that in the three-dimensional boundary of 
the four-dimensional sphere, a relativistic dispersion relation would hold. One might then 
imagine that the relativistic, three-dimensional world we inhabit might be the edge of a 
four-dimensional sphere filled with fermions. Photons and gravitons would be composite 
low-energy degrees of freedom, and the problems currently associated with gravity in the 
UV would be avoided. The authors also argue that massless bosons with spin 3 and higher 
might naturally decouple from other matter, thus explaining why they are not observed in
nature.

In Chapter 3 we will discuss another proposal, dating back to the work of Dirac ([22]) and
Bjorken ([23]), for obtaining massless mediators as the Goldstone bosons of the spontaneous
breaking of Lorentz invariance. Such an arrangement evades the Weinberg-Witten theorem
because the Lorentz invariance of the theory is realized non-linearly in the Goldstone bosons.
Therefore the matrix elements in Eq. (2.69) will not be Lorentz-covariant.

Chapter 3

Goldstone photons and gravitons 

In this chapter we will address some issues connected with the construction of models in
which massless mediators are obtained as Goldstone bosons of the spontaneous breaking
of Lorentz invariance (LI). This presentation is based largely on previously published work
([24, 25]).

3.1 Emergent mediators 

In 1963, Bjorken proposed a mechanism for what he called the "dynamical generation of
quantum electrodynamics" (QED) ([23]). His idea was to formulate a theory that would
reproduce the phenomenology of standard QED, without invoking local $U(1)$ gauge invariance
as an axiom. Instead, Bjorken proposed working with a self-interacting fermion field
theory of the form

$$ \mathcal{L} = \bar\psi (i \partial\!\!\!/ - m) \psi - \lambda (\bar\psi \gamma^\mu \psi)^2 . \tag{3.1} $$

Bjorken then argued that in a theory such as that described by Eq. (3.1), composite
"photons" could emerge as Goldstone bosons resulting from the presence of a condensate
that spontaneously broke LI.

Conceptually, a useful way of understanding Bjorken's proposal is to think of it as a
resurrection of the "luminiferous aether" ([26, 27]): "empty" space is no longer really empty.
Instead, the theory has a non-vanishing vacuum expectation value (VEV) for the current
$j^\mu = \bar\psi \gamma^\mu \psi$. This VEV, in turn, leads to a massive background gauge field $A^\mu \propto j^\mu$, as in the
well-known London equations for the theory of superconductors ([28]). Such a background
spontaneously breaks Lorentz invariance and produces three massless excitations of $A^\mu$ (the
Goldstone bosons) proportional to the changes associated with the three broken Lorentz
transformations.^1

Two of these Goldstone bosons can be interpreted as the usual transverse photons.
The meaning of the third photon remains problematic. Bjorken originally interpreted it
as the longitudinal photon in the temporal-gauge QED, which becomes identified with the
Coulomb force (see also [29]). More recently, Kraus and Tomboulis have argued that the
extra photon has an exotic dispersion relation and that its coupling to matter should be
suppressed ([30]).

Bjorken's idea might not seem attractive today, since a theory such as Eq. (3.1) is
not renormalizable, while the work of 't Hooft and others has demonstrated that a locally
gauge-invariant theory can always be renormalized ([5]). Furthermore, as detailed in
Section 2.4, the gauge theories have had other very significant successes. Unless we take
seriously the line of thought pursued in Chapter 2 that local gauge invariance is suspect
because it is a redundancy of the mathematical description rather than a genuine physical
symmetry, there would not appear to be, at this stage in our understanding of fundamental
physics, any compelling reason to abandon local gauge invariance as an axiom for writing
down interacting QFT's.^2 Furthermore, the arguments for the existence of a LI-breaking
condensate in theories such as Eq. (3.1) have never been solid.^3

^1 In Bjorken's work, $A^\mu$ is just an auxiliary or interpolating field. Dirac had discussed somewhat similar
ideas in [22], but, amusingly, he was trying to write a theory of electromagnetism with only a gauge field and
no fundamental electrons. In both the work of Bjorken and the work of Dirac, the proportionality between
$A^\mu$ and $j^\mu$ is crucial.

^2 According to Mark Wise, though, in the 1980's Feynman considered Bjorken's proposal as an alternative
to postulating local gauge invariance.

^3 For Bjorken's most recent revisiting of his proposal, in the light of the theoretical developments since
1963, see

In 2002 Kraus and Tomboulis resurrected Bjorken's idea for a different purpose of greater
interest to contemporary theoretical physics: making a composite graviton ([30]). They
proposed what Bjorken might call "dynamical generation of gravity." In this scenario a
composite graviton would emerge as a Goldstone boson from the spontaneous breaking of
Lorentz invariance in a theory of self-interacting fermions. Being a Goldstone boson, such
a graviton would be forbidden from developing a potential, thus providing a solution to the
"large cosmological constant problem": the $\Lambda h^\mu{}_\mu$ tadpole term for the graviton would vanish
without fine-tuning (see Section 5.1). This scheme would also seem to offer an unorthodox
avenue to a renormalizable quantum theory of gravity, because the fermion self-interactions
could be interpreted as coming from the integrating out, at low energies, of gauge bosons
that have acquired large masses via the Higgs mechanism, so that Einstein gravity would
be the low energy behavior of a renormalizable theory. This proposal would, of course,
radically alter the nature of gravitational physics at very high energies. Related ideas had
been previously considered in, for instance, [31].

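In [30], the authors consider fermions coupled to gauge bosons that have acquired masses
beyond the energy scale of interest. Then an effective low-energy theory can be obtained
by integrating out those gauge bosons. We expect to obtain an effective Lagrangian of the
form

$$ \mathcal{L} = \bar\psi (i \partial\!\!\!/ - m) \psi + \sum_{n=1}^{\infty} \lambda_n (\bar\psi \gamma^\mu \psi)^{2n} + \sum_{n=1}^{\infty} \mu_n \left[ \bar\psi \frac{i}{2} \left( \gamma^\mu \overrightarrow{\partial^\nu} - \gamma^\mu \overleftarrow{\partial^\nu} \right) \psi \right]^{2n} + \ldots , \tag{3.2} $$

where we have explicitly written out only two of the power series in fermion bilinears that
we would in general expect to get from integrating out the gauge bosons.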

One may then introduce an auxiliary field for each of these fermion bilinears. In this
example we shall assign the label $A^\mu$ to the auxiliary field corresponding to $\bar\psi \gamma^\mu \psi$, and
the label $h^{\mu\nu}$ to the field corresponding to $\bar\psi \frac{i}{2} ( \gamma^\mu \overrightarrow{\partial^\nu} - \gamma^\mu \overleftarrow{\partial^\nu} ) \psi$. It is possible to write
a Lagrangian that involves the auxiliary fields but not their derivatives, so that the corresponding
algebraic equations of motion relating each auxiliary field to its corresponding
fermion bilinear make that Lagrangian classically equivalent to Eq. (3.2). In this case the
new Lagrangian would be of the form

$$ \mathcal{L}' = (\eta^{\mu\nu} + h^{\mu\nu}) \, \bar\psi \frac{i}{2} \left( \overrightarrow{\partial_\mu} - \overleftarrow{\partial_\mu} \right) \gamma_\nu \psi - \bar\psi (A\!\!\!/ + m) \psi + \ldots - V_A(A^2) - V_h(h^2) + \ldots , \tag{3.3} $$

where $A^2 = A_\mu A^\mu$ and $h^2 = h_{\mu\nu} h^{\mu\nu}$. The ellipses in Eq. (3.3) correspond to terms with
other auxiliary fields associated with more complicated fermion bilinears that were also
omitted in Eq. (3.2).
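The classical equivalence between a Lagrangian with an auxiliary field and one with a fermion self-interaction can be seen in a one-component toy model. The following is a minimal sketch under loudly stated assumptions: the fermion bilinear is replaced by an ordinary number `bilinear`, only a single quartic term of the type in Eq. (3.1) is kept, the coupling to the auxiliary field is taken with a minus sign, and the potential is assumed to be the quadratic $V(A^2) = -A^2/(4\lambda)$. None of these specific choices is taken from the source.

```python
def auxiliary_lagrangian(aux, bilinear, coupling):
    """Toy L'(A) = -A*J - V(A^2) with the assumed potential V(A^2) = -A^2 / (4*coupling)."""
    potential = -aux ** 2 / (4.0 * coupling)
    return -aux * bilinear - potential

def stationary_auxiliary(bilinear, coupling):
    """Solve the algebraic equation of motion dL'/dA = -J + A/(2*coupling) = 0."""
    return 2.0 * coupling * bilinear

def quartic_interaction(bilinear, coupling):
    """Quartic term -coupling * J^2, the c-number analogue of the interaction in Eq. (3.1)."""
    return -coupling * bilinear ** 2

def on_shell_difference(bilinear, coupling):
    """L' evaluated on the auxiliary equation of motion, minus the quartic term."""
    aux = stationary_auxiliary(bilinear, coupling)
    return auxiliary_lagrangian(aux, bilinear, coupling) - quartic_interaction(bilinear, coupling)
```

Because the auxiliary field appears without derivatives, its equation of motion is algebraic, and substituting the solution back reproduces the quartic term exactly in this toy setting. A non-quadratic potential would generate the higher powers of the bilinear in the same way.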

We may then imagine that instead of having a single fermion species we have one very
heavy fermion, $\psi_1$, and one lighter one, $\psi_2$. Since Eq. (3.3) has terms that couple both
fermion species to the auxiliary fields, integrating out $\psi_1$ will then produce kinetic terms
for $A^\mu$ and $h^{\mu\nu}$.

In the case of $A^\mu$ we can readily see that since it is minimally coupled to $\psi_1$, the kinetic
terms obtained from integrating out the latter must be gauge-invariant (provided a
gauge-invariant regulator is used). To lowest order in derivatives of $A^\mu$, we must then get the
standard photon Lagrangian $-F_{\mu\nu}^2/4$. Since $A^\mu$ was also minimally coupled to $\psi_2$, we then
have, at low energies, something that has begun to look like QED.

If $A^\mu$ has a non-zero VEV, LI is spontaneously broken, producing three massless
Goldstone bosons, two of which may be interpreted as photons (see [30] for a discussion of how
the exotic physics of the other extraneous "photon" can be suppressed). The integrating
out of $\psi_1$ and the assumption that $h^{\mu\nu}$ has a VEV, by similar arguments, yield a low-energy
approximation to linearized gravity.

Fermion bilinears other than those we have written out explicitly in Eq. (3.2) have their
own auxiliary fields with their own potentials. If those potentials do not themselves produce 
VEV's for the auxiliary fields, then there would be no further Goldstone bosons, and one 
would expect, on general grounds, that those extra auxiliary fields would acquire masses of 
the order of the energy-momentum cutoff scale for our effective field theory, making them 
irrelevant at low energies. 

The breaking of LI would be crucial for this kind of mechanism, not only because we 
know experimentally that photons and gravitons are massless or very nearly massless, but 
also because it allows us to evade the Weinberg-Witten theorem (0), as we discussed in
Section 2.7.

Let us concentrate on the simpler case of the auxiliary field $A^\mu$. For the theory described
by Eq. (3.3), the equation of motion for $A^\mu$ is

$$\frac{\partial\mathcal{L}'}{\partial A_\mu} = -\bar\psi\gamma^\mu\psi - V'(A^2)\cdot 2A^\mu = 0. \tag{3.4}$$

Solving for $\bar\psi\gamma^\mu\psi$ in Eq. (3.4) and substituting into both Eq. (3.2) and Eq. (3.3) we see
that the condition for the Lagrangians $\mathcal{L}$ and $\mathcal{L}'$ to be classically equivalent is a differential
equation for $V(A^2)$ in terms of the coefficients $\lambda_n$:

$$V(A^2) = 2A^2\left[V'(A^2)\right] - \sum_{n=1}^{\infty}\lambda_n 2^{2n}A^{2n}\left[V'(A^2)\right]^{2n}. \tag{3.5}$$
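The step from Eq. (3.4) to Eq. (3.5) can be made explicit. The following is only a sketch, under the assumptions that Eq. (3.2) contains the series $\sum_n \lambda_n(\bar\psi\gamma^\mu\psi)^{2n}$, that Eq. (3.3) contains $-\bar\psi\slashed{A}\psi - V(A^2)$, and that $(\bar\psi\gamma^\mu\psi)^{2n}$ is read as $[(\bar\psi\gamma^\mu\psi)(\bar\psi\gamma_\mu\psi)]^n$. Only the $A^\mu$ sector is kept, and $A^{2n}$ stands for $(A^2)^n$.

```latex
\begin{align*}
\bar\psi\gamma^\mu\psi &= -2\,V'(A^2)\,A^\mu
  && \text{(solving Eq. (3.4))} \\
(\bar\psi\gamma^\mu\psi)^{2n} &= \left[4\,A^2\,V'(A^2)^2\right]^{n}
  = 2^{2n} A^{2n}\,[V'(A^2)]^{2n} \\
-A_\mu\,\bar\psi\gamma^\mu\psi - V(A^2) &= 2A^2\,V'(A^2) - V(A^2)
  && \text{(interaction terms of } \mathcal{L}'\text{)} \\
\sum_{n=1}^{\infty}\lambda_n\,2^{2n}A^{2n}[V'(A^2)]^{2n} &= 2A^2\,V'(A^2) - V(A^2)
  && \text{(equating with those of } \mathcal{L}\text{)}
\end{align*}
```

Rearranging the last line for $V(A^2)$ reproduces the differential equation quoted as Eq. (3.5).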
It is suggested in [30] that for some values of $\lambda_n$ the resulting potential $V(A^2)$ might
have a minimum away from $A^2 = 0$, and that this would give the LI-breaking VEV needed.
It seems to us, however, that a minimum of $V(A^2)$ away from the origin is not the correct
thing to look for in order to obtain LI breaking. The Lagrangian in Eq. (3.3) contains
$A^\mu$'s not just in the potential but also in the "interaction" term $A_\mu\bar\psi\gamma^\mu\psi$, which is not in
any sense a small perturbation as it might be, say, in QED. In other words, the classical
quantity $V(A^2)$ is not a useful approximation to the quantum effective potential for the
auxiliary field.

In fact, regardless of the values of the $\lambda_n$, Eq. (3.5) implies that $V(A^2 = 0) = 0$, and also
that at any point where $V'(A^2) = 0$ the potential must be zero. Therefore, the existence
of a classical extremum at $A^2 = C \neq 0$ would imply that $V(C) = V(0)$, and unless the
potential is discontinuous somewhere, this would require that $V'$ (and therefore also $V$)
vanish somewhere between $0$ and $C$, and so on ad infinitum. Thus the potential $V$ cannot
have a classical minimum away from $A^2 = 0$, unless the potential has poles or some other
discontinuity.
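The chain of implications in this argument can be laid out in a few lines. This is a sketch that uses only Eq. (3.5), read as $V = 2A^2 V' - \sum_n \lambda_n 2^{2n} A^{2n} (V')^{2n}$, and that assumes $V$ is continuous and differentiable on the interval considered, with $V'(0)$ finite. The symbols $x$, $C_1$ and $C_2$ are labels introduced here for illustration, and the case $C > 0$ is shown (the case $C < 0$ is analogous).

```latex
\begin{align*}
x \equiv A^2:\qquad V(x) &= 2x\,V'(x) - \sum_{n\ge 1}\lambda_n\,2^{2n}\,x^{n}\,[V'(x)]^{2n} \\
x = 0 &\;\Rightarrow\; V(0) = 0 \\
V'(C) = 0 &\;\Rightarrow\; V(C) = 0 = V(0) \\
\text{Rolle's theorem:}\quad &\exists\, C_1 \in (0, C) \text{ with } V'(C_1) = 0 \;\Rightarrow\; V(C_1) = 0 \\
&\exists\, C_2 \in (0, C_1) \text{ with } V'(C_2) = 0 \;\Rightarrow\; V(C_2) = 0,\ \ldots
\end{align*}
```

Each extremum therefore forces another one closer to the origin, which is the "ad infinitum" regress described in the text.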

A similar observation applies to any fermion bilinear for which we might attempt this
kind of procedure and therefore the issue arises as well when dealing with the proposal in
[30] for generating the graviton. It is not possible to sidestep this difficulty by including
other auxiliary fields or other fermion bilinears, or even by imagining that we could start,
instead of from Eq. (3.2), from a theory with interactions given by an arbitrary, possibly
non-analytic function of the fermion bilinear $F(\text{bilinear})$. The problem can be traced to
the fact that the equation of motion of any auxiliary field of this kind will always be of the
form

$$0 = -(\text{bilinear}) - V'(\text{field}^2)\cdot 2\,\text{field}. \tag{3.6}$$

The point is that the vanishing of the first derivative of the potential or the vanishing 
of the auxiliary field itself will always, classically, imply that the fermion bilinear is zero. 
Classically at least, it would seem that the extrema of the potential would correspond to 
the same physical state as the zeroes of the auxiliary field. 
3.2 Nambu and Jona-Lasinio model (review)



The complications we have discussed that emerge when one tries to implement LI breaking
as proposed in [30] do not, in retrospect, seem entirely surprising. A VEV for the auxiliary
field would classically imply a VEV for the corresponding fermion bilinear, and therefore a
trick such as rewriting a theory in a form like Eq. (3.3) should not, perhaps, be expected
to uncover a physically significant phenomenon such as the spontaneous breaking of LI for
a theory where it was not otherwise apparent that the fermion bilinear in question had a
VEV. Let us therefore turn our attention to considering what would be required so that
one might reasonably expect a fermion field theory to exhibit the kind of condensation that
would give a VEV to a certain fermion bilinear.

If we allowed ourselves to be guided by purely classical intuition, it would seem likely
that a VEV for a bilinear with derivatives (such as
$\bar\psi\,\tfrac{i}{2}(\gamma^\mu\overrightarrow{\partial}^\nu - \gamma^\mu\overleftarrow{\partial}^\nu)\psi$) might require
non-standard kinetic terms in the action. Whether or not this intuition is correct, we abandon
consideration of such bilinears here as too complicated.

The simplest fermion bilinear is, of course, $\bar\psi\psi$. Being a Lorentz scalar, $\langle\bar\psi\psi\rangle \neq 0$ will
not break LI. This kind of VEV was treated back in 1961 by Nambu and Jona-Lasinio,
who used it to spontaneously break chiral symmetry in one of the early efforts to develop
a theory of the strong nuclear interactions, before the advent of quantum chromodynamics
(QCD) ([32]). It might be useful to review the original work of Nambu and Jona-Lasinio,
as it may shed some light on the study of the possibility of giving VEV's to other fermion
bilinears that are not Lorentz scalars.

In their original paper, Nambu and Jona-Lasinio start from a self-interacting massless 
fermion field theory and propose that the strong interactions be mediated by pions, which 
appear as Goldstone bosons produced by the spontaneous breaking of the chiral symmetry 
associated with the transformation $\psi \mapsto \exp(i\alpha\gamma^5)\psi$. This symmetry breaking is produced
by a VEV for the fermion bilinear $\bar\psi\psi$. In other words, Nambu and Jona-Lasinio originally
proposed what, by close analogy to Bjorken's idea, would be the "dynamical generation of 
the strong interactions."^ 

Nambu and Jona-Lasinio start from a non-renormalizable quantum field theory with a
four-fermion interaction that respects chiral symmetry:

$$\mathcal{L} = i\bar\psi\slashed{\partial}\psi - \frac{g}{2}\left[(\bar\psi\gamma^\mu\psi)^2 - (\bar\psi\gamma^\mu\gamma^5\psi)^2\right]. \tag{3.7}$$

^Historically, though, Bjorken was motivated by the earlier work of Nambu and Jona-Lasinio.

Figure 3.1: Diagrammatic Schwinger-Dyson equation. The double line represents the primed
propagator, which incorporates the self-energy term. The single line represents the unprimed propagator.
1PI' stands for the sum of one-particle irreducible graphs with the primed propagator.

In order to argue for the presence of a chiral symmetry-breaking condensate in the
theory described by Eq. (3.7), Nambu and Jona-Lasinio borrowed the technique of
self-consistent field theory from solid state physics (see, for instance, [33]). If one writes down
a Lagrangian with a free and an interaction part, $\mathcal{L} = \mathcal{L}_0 + \mathcal{L}_i$, ordinarily one would then
proceed to diagonalize $\mathcal{L}_0$ and treat $\mathcal{L}_i$ as a perturbation. In self-consistent field theory one
instead rewrites the Lagrangian as $\mathcal{L} = (\mathcal{L}_0 + \mathcal{L}_s) + (\mathcal{L}_i - \mathcal{L}_s) = \mathcal{L}_0' + \mathcal{L}_i'$, where $\mathcal{L}_s$ is a
self-interaction term, either bilinear or quadratic in the fields, such that $\mathcal{L}_0'$ yields a linear
equation of motion. Now $\mathcal{L}_0'$ is diagonalized and $\mathcal{L}_i'$ is treated as a perturbation.

In order to determine what the form of $\mathcal{L}_s$ is, one requires that the perturbation $\mathcal{L}_i'$ not
produce any additional self-energy effects. The name "self-consistent field theory" reflects
the fact that in this technique $\mathcal{L}_s$ is found by computing a self-energy via a perturbative
expansion in fields that already are subject to that self-energy, and then requiring that such
a perturbative expansion not yield any additional self-energy effects.

Nambu and Jona-Lasinio proceed to make the ansatz that for Eq. (3.7) the
self-interaction term will be of the form $\mathcal{L}_s = -m\bar\psi\psi$. Then, to first order in the coupling
constant $g$, they proceed to compute the fermion self-energy $\Sigma'(p)$, using the propagator
$S'(p) = i(\slashed{p} - m)^{-1}$, which corresponds to the Lagrangian $\mathcal{L}_0' = \bar\psi(i\slashed{\partial} - m)\psi$ that incorporates
the proposed self-energy term.

The next step is to apply the self-consistency condition using the Schwinger-Dyson
equation for the propagator

$$S'(x - y) = S(x - y) + \int d^4z\, S(x - z)\,\Sigma'(0)\,S'(z - y), \tag{3.8}$$

which is represented diagrammatically in Fig. 3.1. The primes indicate quantities that
correspond to a free Lagrangian $\mathcal{L}_0'$ that incorporates the self-energy term, whereas the
unprimed quantities correspond to the ordinary free Lagrangian $\mathcal{L}_0$. For $\Sigma'$ we will use the
approximation shown in Fig. 3.2, valid to first order in the coupling constant $g$.

Figure 3.2: Diagrammatic equation for the primed self-energy. We will work to first order in the
fermion self-coupling constant $g$.

After Fourier transforming Eq. (3.8) and summing the right-hand side as a geometric series,
we find that the self-consistency condition may be written, in our approximation, as

$$m = \Sigma'(0) = \frac{g\,m\,i}{2\pi^4}\int\frac{d^4p}{p^2 - m^2 + i\epsilon}. \tag{3.9}$$

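If we evaluate the momentum integral by Wick rotation and regularize its divergence
by introducing a Lorentz-invariant energy-momentum cutoff $p^2 < \Lambda^2$ we find

$$\frac{2\pi^2 m}{g\Lambda^2} = m\left[1 - \frac{m^2}{\Lambda^2}\ln\left(\frac{\Lambda^2}{m^2} + 1\right)\right]. \tag{3.10}$$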



This equation will always have the trivial solution $m = 0$, which corresponds to the
vanishing of the proposed self-interaction term $\mathcal{L}_s$. But if

$$0 < \frac{2\pi^2}{g\Lambda^2} < 1 \tag{3.11}$$

then there may also be a non-trivial solution to Eq. (3.10), i.e., a non-zero $m$ for which the
condition of self-consistency is met. For a rigorous treatment of the relation between
non-trivial solutions of this self-consistent equation and local extrema in the Wilsonian effective
potential for the corresponding fermion bilinears, see [39] and the references therein.
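For reference, the momentum integral behind the gap equation can be worked out explicitly. The sketch below assumes that the self-consistency condition has the form $m = (g\,m\,i/2\pi^4)\int d^4p\,(p^2 - m^2 + i\epsilon)^{-1}$, which is the normalization that reproduces Eq. (3.10). The Euclidean momentum $p_E$ (with $p^0 = i p_E^0$) is a symbol introduced here.

```latex
\begin{align*}
\int \frac{d^4p}{p^2 - m^2 + i\epsilon}
  &= -i \int_{p_E^2 < \Lambda^2} \frac{d^4p_E}{p_E^2 + m^2}
   = -i\,2\pi^2 \int_0^{\Lambda} \frac{p_E^3\,dp_E}{p_E^2 + m^2} \\
  &= -i\,\pi^2 \left[\Lambda^2 - m^2 \ln\!\left(\frac{\Lambda^2}{m^2} + 1\right)\right] \\
m = \frac{g\,m\,i}{2\pi^4}\,(-i\pi^2)\left[\Lambda^2 - m^2 \ln\!\left(\frac{\Lambda^2}{m^2} + 1\right)\right]
  &\;\Rightarrow\;
  \frac{2\pi^2 m}{g\Lambda^2} = m\left[1 - \frac{m^2}{\Lambda^2}\ln\!\left(\frac{\Lambda^2}{m^2}+1\right)\right]
\end{align*}
```

For $m \neq 0$ the bracket on the right decreases monotonically from 1 (as $m/\Lambda \to 0$) towards 0 (as $m/\Lambda \to \infty$), so a non-trivial solution requires $2\pi^2/g\Lambda^2$ to lie strictly between 0 and 1, which is the condition stated in Eq. (3.11).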
In this model (which from now on we shall refer to as NJL), we see that if the interaction
between fermions and antifermions is attractive ($g > 0$) and strong enough ($2\pi^2/g\Lambda^2 < 1$) it
might be energetically favorable to form a fermion-antifermion condensate. This is reasonable
to expect in this case because the particles have no bare mass and thus the energy
cost of producing them is small. The resulting condensate would have zero net charge,
as well as zero total momentum and spin. Therefore it must pair a left-handed fermion
$\psi_L = \tfrac{1}{2}(1 - \gamma^5)\psi$ with the antiparticle of a right-handed fermion $\psi_R = \tfrac{1}{2}(1 + \gamma^5)\psi$, and vice
versa. This is the mass-term self-interaction $\mathcal{L}_s = -m\bar\psi\psi = -m(\bar\psi_L\psi_R + \bar\psi_R\psi_L)$ that NJL
studies.
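The chiral decomposition of the mass term used above follows from the projector algebra. This is a short sketch in which the projectors $P_L$ and $P_R$ are shorthand introduced here, and which uses only $(\gamma^5)^2 = 1$ and the anticommutation of $\gamma^5$ with $\gamma^0$.

```latex
\begin{align*}
P_{L} = \tfrac12(1-\gamma^5),\quad P_{R} = \tfrac12(1+\gamma^5),\quad
P_{L,R}^2 &= P_{L,R},\quad P_L P_R = 0,\quad P_L + P_R = 1 \\
\psi_{L,R} = P_{L,R}\,\psi,\qquad
\bar\psi_{L} = \psi^\dagger P_{L}\gamma^0 &= \bar\psi\,P_{R},\qquad \bar\psi_{R} = \bar\psi\,P_{L} \\
\bar\psi\psi = \bar\psi P_R P_R\,\psi + \bar\psi P_L P_L\,\psi
  &= \bar\psi_L\psi_R + \bar\psi_R\psi_L \\
\psi \mapsto e^{i\alpha\gamma^5}\psi:\qquad
\bar\psi_L\psi_R &\mapsto e^{2i\alpha}\,\bar\psi_L\psi_R,\qquad
\bar\psi_R\psi_L \mapsto e^{-2i\alpha}\,\bar\psi_R\psi_L
\end{align*}
```

The last line shows why a VEV for $\bar\psi\psi$ breaks the chiral symmetry: the two pieces of the mass term pick up opposite phases under the transformation.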

After QCD became the accepted theory of the strong interactions, the ideas behind the
NJL mechanism remained useful. The $u$ and $d$ quarks are not massless (nor is $u$-$d$ flavor
isospin an exact symmetry) but their bare masses are believed to be quite small compared to
their effective masses in baryons and mesons, so that the formation of $\bar{u}u$ and $\bar{d}d$ condensates
represents the spontaneous breaking of an approximate chiral symmetry. Interpreting the
pions (which are fairly light) as the pseudo-Goldstone bosons generated by the spontaneous
breaking of the approximate $SU(2)_R \times SU(2)_L$ chiral isospin symmetry down to just $SU(2)$
proved a fruitful line of thought from the point of view of the phenomenology of the strong
interaction.^

Condition Eq. (3.11) has a natural interpretation if we think of the interaction in Eq.
(3.7) as mediated by massive gauge bosons with zero momentum and coupling $e$. For it
to be reasonable to neglect boson momentum in the effective theory, the mass $\mu$ of the
bosons should be $\mu > \Lambda$. If $e^2 < 2\pi^2$ then $g = e^2/\mu^2 < 2\pi^2/\Lambda^2$, which violates Eq. (3.11).
Therefore for chiral symmetry breaking to happen, the coupling $e$ should be quite large,
making the renormalizable theory nonperturbative. This is acceptable because the factor
of $1/\mu^2$ allows the perturbative calculations we have carried out in the effective theory Eq.
(3.7). This is why the NJL mechanism is nowadays thought of as a model for a phenomenon
of non-perturbative QCD.
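The estimate in the preceding paragraph amounts to a short chain of inequalities. This is only a sketch: it assumes the identification $g = e^2/\mu^2$ and the condition $2\pi^2/g\Lambda^2 < 1$, and the combination $e^2/4\pi$ is introduced here, by analogy with the fine-structure constant, merely to indicate the size of the coupling.

```latex
\begin{align*}
g = \frac{e^2}{\mu^2},\qquad \mu > \Lambda,\qquad \frac{2\pi^2}{g\Lambda^2} < 1
\;\Rightarrow\;
e^2 = g\,\mu^2 > \frac{2\pi^2\mu^2}{\Lambda^2} > 2\pi^2
\;\Rightarrow\;
\frac{e^2}{4\pi} > \frac{\pi}{2} \approx 1.57
\end{align*}
```

A coupling of this size is well outside the regime where an expansion in $e^2/4\pi$ could be trusted in the underlying renormalizable theory.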



^For a treatment of this subject, including a historical note on the influence of the NJL model in the
development of QCD, see Chap. 19, Sec. 4 in [38].
3.3 An NJL-style argument for breaking LI



We have reviewed how NJL formulated a model that exhibited a non-zero VEV for the
fermion bilinear $\bar\psi\psi$. The next simplest fermion bilinear that we might consider is $\bar\psi\gamma^\mu\psi$,
which was the one that Bjorken, Kraus, and Tomboulis considered when they discussed the
"dynamical generation of QED." This particular fermion bilinear is especially interesting
because it corresponds to the $U(1)$ conserved current, and also because it is the simplest
bilinear with an odd number of Lorentz tensor indices, so that a non-zero VEV for it would
break not only LI but also charge conjugation (C), charge-parity (CP), and charge-parity-time (CPT)
reversal invariance. C and CP may not be symmetries of the Lagrangian, as indeed they
are not in the standard model, but by a celebrated result CPT must be an invariance of any
reasonable theory (see [41] and references therein). This invariance, however, may well be
spontaneously broken, as it would be by any VEV with an odd number of Lorentz indices.

Before proceeding, however, it may be advisable to try to develop some physical intuition
about what would be required for a fermion bilinear like $\bar\psi\gamma^\mu\psi$ to exhibit a VEV. If we
choose a representation of the gamma matrix algebra and use it to write out $(\bar\psi\gamma^\mu\psi)^2$ for
an arbitrary Dirac bispinor $\psi$, we may check that $(\bar\psi\gamma^\mu\psi)^2 \geq 0$ for the choice of mostly
negative metric $g^{\mu\nu} = \mathrm{diag}(1, -1, -1, -1)$. That is, $\bar\psi\gamma^\mu\psi$ is time-like. This has an intuitive
explanation, based on the observation that $\bar\psi\gamma^\mu\psi$ is a conserved fermion-number current
density. Classically a charge density $\rho$ moving with a velocity $\mathbf{v}$ will produce a current
$j^\mu = (\rho, \rho\mathbf{v})$ (in units of $c = 1$). Therefore the relativistic requirement that the charge
density not move faster than the speed of light in any frame of reference implies that
$j^2 \geq 0$. Considerations of causality make it natural to expect that something similar would
be true of $\bar\psi\gamma^\mu\psi$.

For any time-like Lorentz vector $n^\mu$ it is possible to find a Lorentz transformation that
maps it to a vector $n'^\mu$ with only one non-vanishing component: $n'^0$. For a constant current
density $j^\mu$, this means that for $j^\mu$ to be non-zero there must be a charge density $j'^0$, which
has a rest frame. Therefore we only expect to see a VEV for $\bar\psi\gamma^\mu\psi$ if our theory somehow has
a vacuum with a non-zero fermion number density. The consequent spontaneous breaking
of LI may be seen as the introduction of a preferred reference frame: the rest frame of the
vacuum charge.
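The classical statement that a physical current is time-like, and that it has a rest frame, can be checked in two lines. This sketch uses the mostly negative metric $\mathrm{diag}(1,-1,-1,-1)$ and units with $c = 1$; the primed current denotes the components in the frame moving with velocity $\mathbf{v}$, a labeling chosen here for illustration.

```latex
\begin{align*}
j^\mu = (\rho,\ \rho\,\mathbf{v}):\qquad
j^2 = g_{\mu\nu}\,j^\mu j^\nu &= \rho^2\left(1 - \mathbf{v}^2\right) \ge 0
  \quad \text{for } |\mathbf{v}| \le 1 \\
|\mathbf{v}| < 1:\qquad
j'^0 = \frac{\rho - \mathbf{v}\cdot(\rho\,\mathbf{v})}{\sqrt{1-\mathbf{v}^2}} &= \rho\sqrt{1-\mathbf{v}^2},
  \qquad \mathbf{j}' = \mathbf{0}
\end{align*}
```

In the boosted frame only the charge density survives, so a non-zero constant $j^\mu$ singles out that frame as preferred.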

In the literature of finite density quantum field theory and of color superconductivity
(see, for instance, [H^ and [SHI)) the Lagrangians discussed are explicitly non-Lorentz-invariant
because they contain chemical potential terms of the form $f \cdot \bar\psi\gamma^0\psi$. This term
appears in theories whose ground state has a non-zero fermion number because, by the
Pauli exclusion principle, new fermions must be added just above the Fermi surface, i.e.,
at energies higher than those already occupied by the pre-existing fermions, while holes
(which can be thought of as antifermions) should be made by removing fermions at that
Fermi surface. The result is an energy shift that depends on the number of fermions already
present and which has opposite signs for fermions and antifermions, as illustrated in Fig. 3.3.

Figure 3.3: Fermion and antifermion energies in QFT, at zero density (left) and at finite density
(right). Finite density introduces a chemical potential term $-f \cdot \bar\psi\gamma^0\psi$ into the fermion Lagrangian.
The physical picture that emerges is now, hopefully, clearer: A theory with a VEV for
$\bar\psi\gamma^0\psi$ is one with a condensate that has non-zero fermion number. This means that only
theories with some form of attractive interaction between particles with the same sign in
fermion number may be expected to produce such a VEV. The situation is closely analogous
to BCS superconductivity ([10]), in which a phonon-mediated attractive interaction between
electrons allows the presence of a condensate with non-zero electric charge. Note that in
the NJL model, the condensate was composed of fermion-antifermion pairs, and therefore
clearly $\langle\bar\psi\gamma^0\psi\rangle = 0$, which implies $\langle\bar\psi\gamma^\mu\psi\rangle = 0$. It should now be clear why a VEV for $\bar\psi\gamma^\mu\psi$
would break not only LI but also C, CP, and CPT. This picture also helps to clarify the
nature of the Goldstone bosons that we will be invoking as mediators of the electromagnetic
interaction: They are density waves in the background "Dirac sea," whose energy at infinite
wavelengths vanishes because they are then proportional to the broken boosts.
There is an easy way to write a theory that will have a VEV for a $U(1)$ conserved
current: to couple a massive photon to such a current via a purely imaginary charge. To
see this, let us write a Proca Lagrangian for a massive photon field with an external source:

$$\mathcal{L} = -\frac{1}{4}F_{\mu\nu}^2 + \frac{\mu^2}{2}A^2 - j_\mu A^\mu. \tag{3.12}$$

The equation of motion for the photon field is

$$\partial_\mu F^{\mu\nu} = j^\nu - \mu^2 A^\nu. \tag{3.13}$$

At energy scales well below the photon mass $\mu$, the kinetic term $-F_{\mu\nu}^2/4$ may be neglected
with respect to the mass term $\mu^2 A^2/2$. We may then integrate out the photon at
zero momentum by solving the equation of motion Eq. (3.13) for the photon field with
its conjugate momenta $F^{\mu\nu}$ set to zero, and substituting the result back into the Lagrangian
in Eq. (3.12). The resulting low-energy effective field theory has the Hamiltonian

$$\mathcal{H} = \frac{j^2}{2\mu^2}. \tag{3.14}$$



Nothing interesting happens if the source is a timelike current density, since in that case
Eq. (3.14) has its minimum at $j^2 = 0$. But if we were to make the charge coupling to the
photon imaginary (e.g., $j^\mu = ie\bar\psi\gamma^\mu\psi$ for $e$ real), then $j^2$ is actually always negative (recall
that $(\bar\psi\gamma^\mu\psi)^2$ is always positive) and we get a "potential" with the wrong sign, so that the
energy can be made arbitrarily low by decreasing $j^2$. If we make $j^\mu$ dynamical by adding to
the Lagrangian terms corresponding to the field that sets up the current, we might expect,
for certain parameters in the theory, that the energy be minimized for a finite value of $j^2$.
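The zero-momentum integration of the massive photon can be written out explicitly. This is a sketch that assumes the Proca Lagrangian of Eq. (3.12) has the form $-\tfrac14 F_{\mu\nu}^2 + \tfrac{\mu^2}{2}A^2 - j_\mu A^\mu$, so that the equation of motion with the field strength dropped is algebraic. The subscript "eff" is a label introduced here.

```latex
\begin{align*}
\partial_\mu F^{\mu\nu} \to 0:\qquad & 0 = j^\nu - \mu^2 A^\nu
  \;\Rightarrow\; A^\nu = \frac{j^\nu}{\mu^2} \\
\mathcal{L}_{\text{eff}} &= \frac{\mu^2}{2}\,\frac{j^2}{\mu^4} - \frac{j^2}{\mu^2} = -\frac{j^2}{2\mu^2},
  \qquad
  \mathcal{H}_{\text{eff}} = -\mathcal{L}_{\text{eff}} = \frac{j^2}{2\mu^2} \\
j^\mu = i e\,\bar\psi\gamma^\mu\psi:\qquad &
  j^2 = -e^2\left(\bar\psi\gamma^\mu\psi\right)^2 \le 0,
  \qquad
  \mathcal{L}_{\text{eff}} = +\frac{e^2}{2\mu^2}\left(\bar\psi\gamma^\mu\psi\right)^2
\end{align*}
```

With a real charge the energy is bounded below by its value at $j^2 = 0$; with an imaginary charge the sign flips and the energy decreases as $(\bar\psi\gamma^\mu\psi)^2$ grows.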

By making the charge purely imaginary, our effective theory at energy scales much
lower than the photon mass $\mu$ will look similar to Eq. (3.7), except that the four-fermion
interaction in the effective Lagrangian will be $e^2(\bar\psi\gamma^\mu\psi)^2/2\mu^2$ (with an overall positive,
rather than a negative, sign). What this means is that fermions are attracting fermions and
antifermions are attracting antifermions, rather than what we had in NJL (and in QED):
attraction between a fermion and an antifermion. Condensation, if it occurs, will here
produce a net fermion number, spontaneously breaking C, CP, and CPT.^

Figure 3.4: The four-fermion vertex in the self-interacting theory may be seen as the sum of two
photon-mediated interactions with a massive photon that carries zero momentum and is coupled to
the fermion via a purely imaginary charge.

Let us analyze this situation again more rigorously using self-consistent field theory
methods, following Nambu and Jona-Lasinio. For this we consider a fermion field with the
usual free Lagrangian $\mathcal{L}_0 = \bar\psi(i\slashed{\partial} - m_0)\psi$ and pose as our self-consistent ansatz:

$$\mathcal{L}_s = -(m - m_0)\bar\psi\psi - f\bar\psi\gamma^0\psi. \tag{3.15}$$

The corresponding momentum-space propagator for $\mathcal{L}_0' = \mathcal{L}_0 + \mathcal{L}_s$ is, therefore,

$$S'(k) = i\left(\slashed{k} - f\gamma^0 - m\right)^{-1}. \tag{3.16}$$

Now let us suppose that the interaction term looks like

$$\mathcal{L}_i = \frac{g}{2}\left(\bar\psi\gamma^\mu\psi\right)^2. \tag{3.17}$$

To obtain the Feynman rules corresponding to Eq. (3.17) we note that this is what
we would obtain in massive QED if we replaced the charge $e$ by $ie$ and the usual photon
propagator by $ig^{\mu\nu}/\mu^2$, with $g = e^2/\mu^2$. Therefore to compute the self-energy we will rely
on the identity represented in Fig. 3.4. (In QED the second diagram on the right-hand side
of Fig. 3.4 would vanish by Furry's theorem, but in our case the propagator in the loop
will have a chemical potential term that breaks the C invariance on which Furry's theorem
depends.)

^Dyson argued that a theory with a long-range attraction between particles of the same fermion number
would be unstable and used this to suggest that perturbative series in QED would diverge after
renormalization of the charge and mass. As we will see at the end of this section, the "photon" mass $\mu$ will prevent
the instability in our case.

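To leading order in $g$, the self-energy is

$$\Sigma'(0) = -2ig\int\frac{d^4k}{(2\pi)^4}\,\frac{3(k_0 - f)\gamma^0 + 3k_i\gamma^i - 2m}{(k_0 - f)^2 - \mathbf{k}^2 - m^2 + i\sigma\epsilon}, \tag{3.18}$$

where $\sigma$ (a function of $|\mathbf{k}|$, $f$, and $m$) takes values $\pm 1$ so as to enforce the standard Feynman
prescription for shifting the $k_0$ poles: positive $k_0$ poles are shifted down from the real line,
while negative poles are shifted up.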

At first sight it might appear as if the self-energy in Eq. (3.18) could not be used to
argue for the breaking of LI, because the shift in the integration variable $k^\mu \to k'^\mu = (k^0 - f, \mathbf{k})$
would wipe out the $f$ dependence. This, however, is not the case, as we will see. We may carry
out the $dk^0$ integration, for which we must find the corresponding poles. These are located
at

$$k_0 = f \pm \sqrt{\mathbf{k}^2 + m^2}. \tag{3.19}$$

From now on, without loss of generality, we will take $f$ to be positive. The contour
integral that results from closing the $dk^0$ integral of Eq. (3.18) in the complex plane will
vanish unless $f < \sqrt{\mathbf{k}^2 + m^2}$, because otherwise both poles in Eq. (3.19) will lie on the
same side of the imaginary axis. In light of the Feynman prescription used for the shifting
of the poles away from the real axis, it would then be possible to close the contour at infinity
so that there would be no poles in the interior. The pole-shifting prescription, through its
effect on the $dk^0$ integral, is what introduces an actual $f$ dependence into the expression
for the self-energy.
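The two elementary $k^0$ integrals that control this $f$ dependence can be sketched as follows. The sketch assumes that the denominator of Eq. (3.18) is $(k^0 - f)^2 - \mathbf{k}^2 - m^2 + i\sigma\epsilon$, with the pole prescription described in the text, and takes the $k^0$ integral with symmetric limits. The symbols $E$, $I_0$ and $I_1$ are labels introduced here.

```latex
\begin{align*}
E \equiv \sqrt{\mathbf{k}^2 + m^2},\qquad & \text{poles at } k^0 = f \pm E \\
I_0 \equiv \int_{-\infty}^{\infty} \frac{dk^0}{2\pi}\,
  \frac{1}{(k^0 - f)^2 - E^2 + i\sigma\epsilon}
  &= -\frac{i}{2E}\,\theta(E - f) \\
I_1 \equiv \int_{-\infty}^{\infty} \frac{dk^0}{2\pi}\,
  \frac{k^0 - f}{(k^0 - f)^2 - E^2 + i\sigma\epsilon}
  &= -\frac{i}{2}\,\theta(f - E)
\end{align*}
```

For $f < E$ the poles straddle the origin and $I_0$ takes its usual Feynman value, while the residue and the arc at infinity cancel in $I_1$. For $f > E$ both poles are shifted below the real axis, so closing the contour above encloses nothing: $I_0$ vanishes and $I_1$ reduces to minus the contribution of the arc at infinity. Neither result can be obtained by simply shifting $k^0 \to k^0 - f$.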

By the Cauchy integral formula, we have

$$\Sigma'(0) = g\int\frac{d^3k}{(2\pi)^3}\left[\left(3\gamma^0 + \frac{2m}{\sqrt{\mathbf{k}^2 + m^2}}\right)\theta\left(\sqrt{\mathbf{k}^2 + m^2} - f\right) - 3\gamma^0\right], \tag{3.20}$$

where the second term in the right-hand side subtracts the contribution from closing the
contour out at infinity in the complex plane (note the branch cut in the logarithm that
results from computing that part of the contour integral explicitly). We will introduce the
cutoff $\mathbf{k}^2 < \Lambda^2$ to make the integral in Eq. (3.20) finite.^



^Carrying out the $dk^0$ integration separately from the spatial integral is legitimate and useful in light of
the form of Eq. (3.18), which does not lend itself naturally to Wick rotation. But the use of a
non-Lorentz-invariant regulator may cause concern that any breaking of LI we might arrive at could be an artifact of
our choice of regulator. An alternative is to regulate Eq. (3.20) dimensionally by replacing $d^3k$ with $d^{d-1}k$.
The resulting equations are more complicated and the dependence on the range of energies where our
non-renormalizable theory is valid is obscured, but the overall argument does not change. It is also possible to
multiply the integrand in Eq. (3.18) by a cutoff in Minkowski space [...] we get the same result as in Eq. [...].
For $\mathbf{k}^2 > \Lambda^2$ we must impose the condition that $k_0^2 > \mathbf{k}^2 - \Lambda^2$.
It should be pointed out that previous work on LI breaking has used 3-momentum cutoffs in computing
self-energies [56], although in that case there seems to be a physical interpretation for such a cutoff which
does not apply to the present discussion. The original work of Nambu and Jona-Lasinio [32] considers cutoffs
in Euclidean 4-momentum and in 3-momentum, arriving in both cases at similar conclusions.

Figure 3.5: Plots of the left-hand side (in gray) and right-hand side (in black) of Eq. (3.25).
Define $\alpha \equiv g/2\pi^2$. For each plot the parameters are: (a) $\Lambda = 100$, $m_0 = 0$, $\alpha = 0.001$.
(b) $\Lambda = 100$, $m_0 = 15$, $\alpha = 0.001$. (c) $\Lambda = 100$, $m_0 = 1200$, $\alpha = 0.001$. (d) $\Lambda = 100$, $m_0 = 0$, $\alpha = 0.002$.
(e) $\Lambda = 100$, $m_0 = 15$, $\alpha = 0.002$. (f) $\Lambda = 200$, $m_0 = 15$, $\alpha = 0.001$.

Note that the Heaviside step function $\theta(\sqrt{\mathbf{k}^2 + m^2} - f)$ in Eq. (3.20) is always unity if
$m > f$, so that there will be no $f$ dependence at all in Eq. (3.20) unless $m < f$. Assuming
that $m < f$ we have

$$\Sigma'(0) = \frac{g}{2\pi^2}\left[-\left(f^2 - m^2\right)^{3/2}\gamma^0 + m^3\log\left(f + \sqrt{f^2 - m^2}\right) - m^3\log\left(\Lambda + \sqrt{\Lambda^2 + m^2}\right) + m\Lambda\sqrt{\Lambda^2 + m^2} - mf\sqrt{f^2 - m^2}\right]. \tag{3.21}$$

As before, we use the Schwinger-Dyson equation Eq. (3.8), and after summing up the
right-hand side as a geometric series, we arrive at the self-consistency condition for our
ansatz Eq. (3.15):



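$$m_0 - m - f\gamma^0 = \frac{g}{2\pi^2}\left[-\left(f^2 - m^2\right)^{3/2}\gamma^0 + m^3\log\left(\frac{f + \sqrt{f^2 - m^2}}{\Lambda + \sqrt{\Lambda^2 + m^2}}\right) + m\Lambda\sqrt{\Lambda^2 + m^2} - mf\sqrt{f^2 - m^2}\right]. \tag{3.22}$$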



Clearly Eq. (3.22) will not admit a non-trivial solution $f \neq 0$ unless $g$ is positive, which
agrees with our intuition that the theory must exhibit attraction between particles of the
same fermion number. The self-consistent condition Eq. (3.22) may be separated into two
simultaneous equations:



$$f = \frac{g}{2\pi^2}\left(f^2 - m^2\right)^{3/2} \tag{3.23}$$



and 



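$$m_0 - m = \frac{gm}{2\pi^2}\left[m^2\log\left(\frac{f + \sqrt{f^2 - m^2}}{\Lambda + \sqrt{\Lambda^2 + m^2}}\right) + \Lambda\sqrt{\Lambda^2 + m^2} - f\sqrt{f^2 - m^2}\right]. \tag{3.24}$$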



It is important to bear in mind that Eqs. (3.23) and (3.24) were written under the
assumption that $f > m$. For $f < m$ the $f$ dependence of the self-energy in Eq. (3.18) disappears.
The trivial, Lorentz-invariant solution $f = 0$ to the self-consistent equations will always
be present for any $m$, as should be the case when spontaneous breaking of a symmetry is
observed.



Equation (3.23) can be readily solved for $f$ as a function of $m$ (imposing the condition
that $f$ be real and positive), and the resulting $f(m)$ can be substituted into Eq. (3.24) to
yield

$$m_0 - m = \frac{gm}{2\pi^2}\left[m^2\log\left(\frac{f(m) + \sqrt{f^2(m) - m^2}}{\Lambda + \sqrt{\Lambda^2 + m^2}}\right) + \Lambda\sqrt{\Lambda^2 + m^2} - f(m)\sqrt{f^2(m) - m^2}\right]. \tag{3.25}$$



Equation (3.25) cannot be solved algebraically, but we may study some of its properties
graphically. In Fig. 3.5 we have plotted the left-hand side and the right-hand side of Eq.
(3.25) for various values of the parameters $g$, $m_0$, and $\Lambda$. As plot (a) illustrates, $m_0 = 0$
implies $m = 0$, i.e., we cannot dynamically generate both a chemical potential and a mass
term. For $m = m_0 = 0$ we have

$$f = \pi\sqrt{\frac{2}{g}}. \tag{3.26}$$
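The massless limit can be verified directly. This is a sketch that assumes Eq. (3.23) reads $f = (g/2\pi^2)(f^2 - m^2)^{3/2}$, with $f$ real and positive as stipulated in the text.

```latex
\begin{align*}
m = m_0 = 0:\qquad
f = \frac{g}{2\pi^2}\left(f^2\right)^{3/2} = \frac{g}{2\pi^2}\,f^3
\;\Rightarrow\;
f = 0 \quad\text{or}\quad f^2 = \frac{2\pi^2}{g},\quad f = \pi\sqrt{\frac{2}{g}}
\end{align*}
```

The non-trivial root exists only for $g > 0$, in line with the requirement of attraction between particles of the same fermion number.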

Plot (b) in Fig. 3.5 shows a $0 < m_0 \ll \Lambda$ for which the corresponding $m$ will be
significantly less than $m_0$. Plot (c) in the same figure illustrates that a very large $m_0$ is
needed before $m > m_0$, but such solutions are not physically meaningful because $m_0$ itself
is already well beyond the energy scale for which our effective theory is supposed to hold.
By comparing plot (b) to plot (e) we may see the effect of increasing $g$ for a given $m_0$ and
$\Lambda$. A comparison of plots (b) and (f) should illustrate the effect of increasing $\Lambda$ with the
other parameters fixed.

The plots in Fig. 3.6 illustrate the progression, as the parameter $\Lambda$ is increased for
fixed $\alpha$, from an unstable theory in which bare masses $m_0$ on the order of $\Lambda$ are mapped to
$m > \Lambda$, to a theory that maps such bare masses to $m < \Lambda$. Such an analysis of Eq. (3.25)
reveals that the condition for this mass stability is

$$0 < \frac{2\pi^2}{g\Lambda^2} < 1, \tag{3.27}$$

which is reminiscent of the condition Eq. (3.11) for chiral symmetry breaking in the NJL
model (except that now the interaction has the opposite sign). Combining Eq. (3.27) with
Eq. (3.26) (which was exact for $m_0 = 0$ but may serve approximately for $m_0$ small) we arrive
at the requirement

$$0 < f^2 < \Lambda^2, \tag{3.28}$$
which would surely have to hold if our theory were stable. Indeed, we may interpret Eq.
(3.28) as saying that if we pick physically good parameters $g$, $m_0$, and $\Lambda$ we will have a
stable theory with finite chemical potential $f$. The parameters for plots (a), (b), (d), (e),
and (f) in Fig. 3.5 all give examples of such stable theories. As in NJL, the good parameters
involve a coupling $g$ that is large in units of the cutoff ($g\Lambda^2 > 2\pi^2$), suggesting that Eq. (3.17)
should be a low-energy approximation to a non-perturbative interaction of a full renormalizable
theory that allows attraction between particles of the same fermion number sign.

Figure 3.6: Plots of the left-hand side (in gray) and right-hand side (in black) of Eq. (3.25).
For all of them $\alpha = g/2\pi^2 = 0.01$. (a) $\Lambda = m_0 = 2$. (b) $\Lambda = m_0 = 8$. (c) $\Lambda = m_0 = 12$.
(d) $\Lambda = m_0 = 16$.
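The stability condition can be checked against the parameter values quoted in the figure captions. This is a sketch under two assumptions that should be read as such: that the plotting parameter is $\alpha \equiv g/2\pi^2$, and that the massless result $f^2 = 2\pi^2/g$ is used as an estimate of the chemical potential.

```latex
\begin{align*}
\alpha \equiv \frac{g}{2\pi^2}:\qquad
\frac{2\pi^2}{g\Lambda^2} < 1 \iff \alpha\Lambda^2 > 1,
\qquad
f^2 \simeq \frac{2\pi^2}{g} = \frac{1}{\alpha}
\;\Rightarrow\; 0 < f^2 < \Lambda^2 \\
\text{Fig. 3.5(a): } \alpha = 0.001,\ \Lambda = 100:\qquad
\alpha\Lambda^2 = 10,\qquad f \simeq 31.6 < \Lambda \\
\text{Fig. 3.6: } \alpha = 0.01,\ \Lambda = 2,\ 8,\ 12,\ 16:\qquad
\alpha\Lambda^2 = 0.04,\ 0.64,\ 1.44,\ 2.56
\end{align*}
```

Under these assumptions the threshold $\alpha\Lambda^2 = 1$ falls between panels (b) and (c) of Fig. 3.6, which matches the progression from unstable to stable behavior described in the text.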

The issue of how the form of the self-consistent equations will depend on the choice of
regulator for the integral in Eq. (3.18) is not an entirely straightforward matter. But it
seems to be a solid conclusion that, for positive fermion self-coupling $g$, the solutions to
such self-consistent equations show the presence of LI-breaking vacua. In the next section
of this paper we offer an alternative approach that strengthens this conclusion and that
sheds further light on the issue of stability.
Figure 3.7: Correction of the effective potential of the auxiliary field from integrating out the
fermion. The first graph does not contribute by the Ward identity, while the second vanishes by
Furry's theorem.

3.4 Consequences for emergent photons 

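The theory

$$\mathcal{L} = \bar\psi(i\slashed{\partial} - m_0)\psi + \frac{g}{2}\left(\bar\psi\gamma^\mu\psi\right)^2 \tag{3.29}$$

is equivalent to

$$\mathcal{L}' = \bar\psi(i\slashed{\partial} - \slashed{A} - m_0)\psi - \frac{A^2}{2g}. \tag{3.30}$$

Since we argued that Eq. (3.29) may spontaneously break LI by giving a finite $\langle\bar\psi\gamma^\mu\psi\rangle$,
we conclude that $A^\mu$ in Eq. (3.30) would also have a finite VEV, since, by the algebraic
equation of motion,

$$A^\mu = -g\bar\psi\gamma^\mu\psi. \tag{3.31}$$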
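The classical equivalence of the two Lagrangians can be verified in two steps. This is a sketch that assumes the auxiliary-field Lagrangian has the form $\bar\psi(i\slashed{\partial} - \slashed{A} - m_0)\psi - A^2/2g$, with no derivatives of $A^\mu$, so that its equation of motion is algebraic.

```latex
\begin{align*}
\frac{\partial\mathcal{L}'}{\partial A_\mu}
  = -\bar\psi\gamma^\mu\psi - \frac{A^\mu}{g} = 0
  &\;\Rightarrow\; A^\mu = -g\,\bar\psi\gamma^\mu\psi \\
-\bar\psi\slashed{A}\psi - \frac{A^2}{2g}
  &= g\left(\bar\psi\gamma^\mu\psi\right)^2 - \frac{g}{2}\left(\bar\psi\gamma^\mu\psi\right)^2
   = \frac{g}{2}\left(\bar\psi\gamma^\mu\psi\right)^2
\end{align*}
```

Substituting the solution back therefore reproduces the four-fermion interaction, and the term $-A^2/2g$ plays the role of a mass term with the wrong sign when $g > 0$.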

This interpretation agrees with the observation that Eq. (3.30) has a vector boson field
whose mass term carries the wrong sign if $g > 0$, indicating that the zero-field state is not
a good vacuum. To find the correct vacuum for the theory we must carry out the path
integral over the fermion field to obtain the effective action $\Gamma[A]$, and then minimize that
quantity. Figure 3.7 shows the radiative corrections to $\Gamma[A]$ as a perturbative series, in terms
of Feynman diagrams. The field $A^\mu$ is minimally coupled to $\psi$ so that the computation
should proceed as in QED. By the Ward identity we do not expect a correction to the mass
term for $A^\mu$, as long as an adequate regulator is used. But we do expect to get terms in the
effective action that go as $A^4$ and higher even powers of the auxiliary field.

Since we have reason to believe that QED is stable for any value of the charge $e$, it
therefore seems logical to expect that the effective action for $A^\mu$ in Eq. (3.30) gives it a
finite time-like VEV, which would imply a finite VEV for $\bar\psi\gamma^\mu\psi$ in the theory of Eq. (3.29).
We argued in the previous section that $g$ must be large for the theory described by Eq.
(3.29) to be stable. This too seems natural in light of Eq. (3.30), because a large $g$ makes
the $A^2$ term small, so that the instability created by it may be easily controlled by the
interaction with the fermions, yielding a VEV for $A^\mu$ that lies within the energy range of
the effective theory. Figure 3.8 schematically represents how the radiative corrections to
the effective action give a finite VEV for $A^\mu$.

Figure 3.8: Radiative corrections make the effective potential $\Gamma[A]$ stable and give $A^\mu$ a non-zero
VEV.

Armed with Eq. (3.30) it would seem possible to carry out the program proposed by
Bjorken, and by Kraus and Tomboulis, in order to arrive at an approximation of QED in
which the photons are composite Goldstone bosons. It is conceivable that a complicated
theory of self-interacting fermions, perhaps one with non-standard kinetic terms, might
similarly yield a VEV for $\bar\psi\,\tfrac{i}{2}(\gamma^\mu\overrightarrow{\partial}^\nu - \gamma^\mu\overleftarrow{\partial}^\nu)\psi$, allowing the project of dynamically generating
linearized gravity to go forward.

It would have been more encouraging if we had been able to obtain a non-zero $\langle\bar\psi\gamma^\mu\psi\rangle$
through a more natural mechanism than invoking an imaginary charge. Non-abelian gauge
theories (such as QCD) exhibit attraction between particles of the same fermion number
(and, like abelian theories with imaginary charge, they exhibit anti-screening). So far,
however, attempts to find a non-abelian gauge theory with non-zero $\langle\bar\psi\gamma^\mu\psi\rangle$ have failed,
possibly because in such theories the attraction between fermion and antifermion is stronger
than the attraction between fermions (see, for instance, [43]).
Chapter 4

Phenomenology of spontaneous 
Lorentz violation 

What lies behind the Principle of Relativity? This is a philosophical 
question, not a scientific one. You will have your own opinion; here is ours. 
We think the Principle of Relativity as used in special relativity rests on 
one word: emptiness. Space is empty. 

— Edwin F. Taylor and John A. Wheeler, Spacetime Physics, Chap. 3 

4.1 Introduction 

Lorentz invariance (LI), the fundamental symmetry of Einstein's special relativity, states
that physical results should not change after an experiment has been boosted or rotated.
In recent years, and particularly since the publication of work on the possibility of
spontaneously breaking LI in bosonic string field theory [45], there has been considerable
interest in the prospect of violating LI. More recent motivations for work on Lorentz
non-invariance have ranged from the explicit breaking of LI in the non-commutative geometries
that some have proposed as descriptions of physical space-time (see and references therein), and
in certain supersymmetric theories considered by the string community [46, 47], to the
possibility of explaining puzzling cosmic ray measurements by invoking small departures
from LI [48] or modifications to special relativity itself [49, 50]. It has also been
suggested that anomalies in certain chiral gauge theories may be traded for violations of
LI and CPT [52]. Extensions of the standard model have been proposed that are meant
to capture the low-energy effects of whatever new high-energy physics (string theory,
non-commutative geometry, loop quantum gravity, etc.) might be introducing violations of LI.

Our own investigation of composite massless mediators in Chapters 2 and 3 led us to
consider the question of how a reasonable QFT might spontaneously break LI through a
timelike Lorentz vector VEV $\langle\bar\psi\gamma^\mu\psi\rangle \neq 0$. This breaking of LI can be
thought of conceptually as the introduction of a preferred frame: the rest frame of the fermion
number density. If some kind of gauge coupling were added to the theory without destroying this
LI breaking, the fermion number density would also be a charge density, and the preferred frame
would be the rest frame of a charged background in which all processes are taking place. This
allows us to make some very general remarks in Section 4.2 on the resulting LI-violating
phenomenology for electrodynamics and on experimental limits to our non-Lorentz-invariant
VEV. This discussion will be based on work previously published in [24].

Experimental data put very tight constraints on Lorentz-violating operators that involve
Standard Model particles, but the bounds on Lorentz violation that appears only in couplings
to gravity are more model-independent [67, 68]. One broad class of Lorentz-breaking
gravitational theories is that of the so-called vector-tensor theories, in which the space-time
metric $g_{\mu\nu}$ is coupled to a vector field that does not vanish in the vacuum. Consideration
of such theories dates back to [55] and their potentially observable consequences are
extensively discussed in [70]. These theories have an unconstrained vector field coupled to
gravity. Theories with a unit constraint on the vector field were proposed as a means of
alleviating the difficulties that plagued the original unconstrained theories [71].

The phenomenology of these theories with the unit constraint has been recently explored.
They have been proposed as a toy model for modifying dispersion relations at high energy [72].
The spectrum of long-wavelength excitations is discussed in [73], where it was found that
all polarizations have a relativistic dispersion relation, but travel with different velocities.
Applications of these theories to cosmology have been considered in [74, 75]. Constraints
on these theories are weak; for instance, there are no corrections to the Post-Newtonian
parameters $\gamma$ and $\beta$ [76]. The status of this class of theories, also known as
"æther-theories," is reviewed in [77].

In Section 4.4 we will show that the general low-energy effective action at the
two-derivative level of the Goldstones of spontaneous Lorentz violation by a timelike vector
VEV minimally coupled to gravity corresponds to the vector-tensor theory of gravity with
the unit constraint. This will allow us to place observational constraints of very general
validity on this kind of Lorentz violation, from solar system tests of gravity. This discussion
will be based on work previously published in [54]. Finally, in Section 4.5 we shall discuss
the physical meaning of this kind of Lorentz violation and its relation to some other models
that have appeared recently in the literature.

4.2 Phenomenology of Lorentz violation by a background 
source 

Following up on the idea presented in Chapter 2, imagine that the fermions of the universe
have some interaction that plays the role of Eq. (3.17) in giving a VEV to
$\bar\psi\gamma^\mu\psi$, and that in addition they have a $U(1)$ gauge coupling (at this stage we
have abandoned the project of producing composite photons). Then the $U(1)$ gauge field may
interact with a charged background and we would be breaking LI in electrodynamics by
introducing a preferred frame: the rest frame of the background source.

The possibility of a vacuum that breaks LI and has non-trivial optical properties has
already been investigated in [55, 56]. This work, however, deals with significantly more
complicated models, both in terms of the interactions that spontaneously break LI and of
the optical properties of the resulting vacuum. To obtain a phenomenology for our own
simpler proposal, we consider a free photon Lagrangian of the form

$$\mathcal{L} = -\frac{1}{4}F_{\mu\nu}^2 - j_\mu A^\mu, \tag{4.1}$$

where $j^\mu = e\langle\bar\psi\gamma^\mu\psi\rangle$, thought of as an external source. The
corresponding propagator for the free photon is

$$\langle T\{A^\mu(x)A^\nu(y)\}\rangle = D^{\mu\nu}(x-y) + \langle A^\mu(x)\rangle_j\,\langle A^\nu(y)\rangle_j, \tag{4.2}$$

where $D^{\mu\nu}(x-y)$ is the connected photon propagator and $\langle A^\mu(x)\rangle_j$ is the
expectation value of $A^\mu$ in the presence of the external source.

If we take $j^\mu$ constant and naively attempt to calculate the classical expectation value of
$A^\mu$ in the presence of a constant source by integrating the Green function for
electrodynamics, we will get a volume divergence. We may attempt to regulate this volume
divergence by introducing a photon mass $\mu$, which gives the result

$$\langle A^\mu(x)\rangle_j = \frac{j^\mu}{\mu^2}. \tag{4.3}$$

(It is trivial to check that this is a solution to $\Box A^\mu - \mu^2 A^\mu = -j^\mu$, the wave
equation for the massive photon field with a source.) This is not satisfactory because the
disconnected term in Eq. (4.2) will be proportional to $\mu^{-4}$ and Feynman diagrams computed
with our modified photon propagator would produce results that depend strongly on what we took
for a regulator. In fact the mass is physical and analogous to the effective photon mass
first described by the London brothers in their theory of the electromagnetic behavior of
superconductors [28]. (Using the language of particle physics we may say that, in the
presence of a $U(1)$ gauge field, the VEV $\langle\bar\psi\gamma^\mu\psi\rangle$ spontaneously breaks
the gauge invariance and gives a mass to the boson, as in the Higgs mechanism.)

Photons in a superconductor propagate through a constant electromagnetic source. In
a simplified picture, we may think of it as a current density set up by the motion of charge
carriers of mass $m$ and charge $e$, moving with a velocity $\vec u$. The proper charge density is
$\rho_0$. The proper velocity of the charge carriers is $\eta^\mu = (1, \vec u)/\sqrt{1-u^2}$. The
source is then $j^\mu = \rho_0\eta^\mu = \rho_0 p^\mu/m$, where $p^\mu$ is the classical energy
momentum of the charge carriers. We may think of $m$ and $\rho_0$ as deriving from the solutions to
the parameters in a self-consistent equation such as we had in Eq. (3.25).

The canonical energy momentum of the system is $P^\mu = m\eta^\mu + eA^\mu = mj^\mu/\rho_0 + eA^\mu$.
As is discussed in the superconductivity literature (see, for instance, Chap. 8 in [57]), the
superconducting state has zero canonical energy momentum, which leads to the London
equation

$$j^\mu = -\frac{e\rho_0}{m}A^\mu. \tag{4.4}$$

With this inserted into the right-hand side of $\Box A^\mu = -j^\mu$ (the wave equation for the
photon field in the Lorenz gauge), we find that we have a solution to the wave equation of
a massive $A^\mu$ with no source and a mass $\mu^2 = e\rho_0/m$:

$$\Box A^\mu - \frac{e\rho_0}{m}A^\mu = 0. \tag{4.5}$$
If we solve for $A^\mu$ in Eq. (4.4) and substitute this back into Eq. (4.2), we get that

$$\langle T\{A^\mu(x)A^\nu(y)\}\rangle = D^{\mu\nu}(x-y) + \frac{m^2}{e^2\rho_0^2}\,j^\mu j^\nu. \tag{4.6}$$

Notice that if $j^\mu$ is not constant, then Fourier transformation of the second term in Eq.
(4.6) will not yield, in Feynman diagram vertices, the usual energy-momentum conserving
delta function. Therefore, presumed small violations of energy or momentum conservation
in electromagnetic processes could conceivably be parametrized by the space-time variation
of the background source.¹
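
For reference, the chain of substitutions leading from the London equation to Eq. (4.6) can be
written out explicitly. This is only a sketch in the sign conventions assumed here (those of the
wave equation $\Box A^\mu = -j^\mu$ quoted above), with $P^\mu$ denoting the canonical energy
momentum and $\mu$ the resulting photon mass.

```latex
\begin{align*}
% Vanishing canonical energy momentum in the superconducting state: London equation
P^\mu = \frac{m}{\rho_0}\,j^\mu + e A^\mu = 0
  \;\;&\Longrightarrow\;\; j^\mu = -\frac{e\rho_0}{m}\,A^\mu , \\
% Inserting the London equation into the sourced wave equation: Eq. (4.5)
\Box A^\mu = -j^\mu = \frac{e\rho_0}{m}\,A^\mu
  \;\;&\Longrightarrow\;\; \Box A^\mu - \mu^2 A^\mu = 0, \qquad \mu^2 = \frac{e\rho_0}{m}, \\
% Solving the London equation for A^\mu gives the disconnected piece of Eq. (4.2)
\langle A^\mu \rangle_j = -\frac{m}{e\rho_0}\,j^\mu
  \;\;&\Longrightarrow\;\;
  \langle A^\mu\rangle_j \langle A^\nu\rangle_j = \frac{m^2}{e^2\rho_0^2}\,j^\mu j^\nu .
\end{align*}
```

The disconnected term is quadratic in $\langle A^\mu\rangle_j$, so the sign of the London relation
drops out of Eq. (4.6).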

With Eq. (4.6) and a rule for external massive photon legs, one may then go ahead and
calculate the amplitude for various electromagnetic processes with this modified photon
propagator, and parametrize supposed observed violations of LI (see [2310011^) by $j^\mu$. If
we can make an estimate of the size of the mass $m$ of the background charges, experimental
limits on the photon mass ($< 2\times10^{-16}$ eV according to [HI]) will provide a limit on the
VEV of $\bar\psi\gamma^\mu\psi$, in light of Eq. (4.5).

There are other consequences of a VEV $\langle\bar\psi\gamma^\mu\psi\rangle \neq 0$ on which we may
speculate. Such a background may have cosmological effects, a line of thought that might
connect, for instance,
with pF^. Also, it is conceivable that such a VEV might have some relation to the problem of 
baryogenesis, since it gives the background finite fermion number and spontaneously breaks 
CPT, a violation that can ease the Sakharov condition of thermodynamical non-equilibrium 

[Ml ESI- 

4.3 Effective action for the Goldstone bosons of spontaneous 
Lorentz violation 

Here we begin by considering the general low-energy effective action for a theory in which
Lorentz invariance is spontaneously broken by the VEV of a Lorentz four-vector $S^\mu$. With
an appropriate rescaling, the VEV satisfies

$$\langle S^\mu S_\mu\rangle = 1, \tag{4.7}$$

since we assume the VEV of $S^\mu$ is time-like. The existence of this VEV implies that there
exists a universal rest frame (which we sometimes refer to as the preferred frame) in which
$\langle S^\mu\rangle = \delta^\mu_0$. When the resulting low-energy effective action is minimally
coupled to gravity, we shall see that it simply becomes the vector-tensor theory with the unit
constraint.

¹This line of thought could connect to work on LI violation from variable couplings as discussed in [58].

Objects of mass $M_1$ and $M_2$ in a system moving relative to the preferred frame can
experience a modification to Newton's law of gravity of the form [70, 78]

$$U(r) = -G_N\frac{M_1 M_2}{r}\left(1 - \frac{\alpha_2}{2}\,\frac{(\vec w\cdot\vec r)^2}{r^2}\right), \tag{4.8}$$

where $\vec w$ is the velocity of the system under consideration, such as the solar system or
Milky Way galaxy, relative to the universal rest frame. The main purpose of this note is to
compute $\alpha_2$ in theories where Lorentz invariance is spontaneously broken by the VEV of a
four-vector.

The VEV of $S^\mu$ spontaneously breaks Lorentz invariance. But as rotational invariance is
preserved in the preferred frame, only the three boost generators of the Lorentz symmetry
are spontaneously broken. The low-energy fluctuations $S^\mu(x)$ which preserve Eq. (4.7) are
the Goldstone bosons of this breaking, i.e., those that satisfy

$$S^\mu(x)S_\mu(x) = 1. \tag{4.9}$$

In the preferred frame the fluctuations can be parameterized as a local Lorentz
transformation

$$S^\mu(x) = \Lambda^\mu{}_0(x) = \frac{1}{\sqrt{1-\vec\phi^{\,2}}}\,(1, \vec\phi), \tag{4.10}$$

where $\vec\phi$ is a vector with components $\phi^1$, $\phi^2$, and $\phi^3$.

Under Lorentz transformations $S^\mu(x) \to \Lambda^\mu{}_\nu S^\nu(x)$ and the symmetry is realized
non-linearly on the fields $\phi^i$. Using this field $S^\mu(x)$ we may then couple the Goldstone
bosons to Standard Model fields. Since, however, the constraints on Lorentz-violating operators²
involving Standard Model fields are considerable, we instead focus on their couplings
to gravity, which are more model-independent because they are always present once the
Goldstone bosons are made dynamical.

²More correctly, operators that appear to be Lorentz violating when the Goldstone bosons $\phi^i$ are set to
zero.
The Goldstone bosons are made dynamical by adding in kinetic terms for them. Since
Lorentz invariance is only broken spontaneously, the action for the kinetic terms should
still be invariant under Lorentz transformations. The only interactions relevant at the
two-derivative level and not eliminated by the constraint Eq. (4.9) are³

$$\mathcal{L} = c_1\,\partial_\alpha S^\beta\partial^\alpha S_\beta + (c_2+c_3)\,\partial_\mu S^\mu\partial_\nu S^\nu + c_4\,S^\mu\partial_\mu S^\alpha\,S^\nu\partial_\nu S_\alpha. \tag{4.11}$$

Expanding this action to quadratic order in $\phi^i$, one finds that the four parameters $c_i$ can
be chosen to avoid the appearance of any ghosts. In particular, we require $c_1 + c_4 < 0$.⁴

To leading order, the effective action for the Goldstone bosons is:

$$\mathcal{L} \propto \sum_{i=1,2,3}\left[(\partial_\mu\phi^i)^2 - \alpha\,(\partial_i\phi^i)^2\right], \tag{4.12}$$

where $\alpha = (c_2+c_3)/c_1$. By inserting a plane wave ansatz, $\phi^i(x^\mu) \propto
\exp(i\omega x^0 - ikx^3)$, we see that we have 2 transverse waves, $\phi^1$ and $\phi^2$, with speed
$v = \omega/k = 1$, and one longitudinal wave, $\phi^3$, with $v = \sqrt{1+\alpha}$. Since we have
broken LI, massless particles no longer need to travel at light speed. For $\alpha > 0$, the
longitudinal Goldstone boson is superluminal. We shall return to the issue of superluminality in
Section 4.5.
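
The two propagation speeds quoted above follow directly from the equations of motion. The
sketch below assumes the quadratic Lagrangian of Eq. (4.12) in the form
$\sum_i[(\partial_\mu\phi^i)^2 - \alpha(\partial_i\phi^i)^2]$ (no sum over $i$ inside the second
term, with the overall normalization irrelevant), metric signature $(+,-,-,-)$, and a wave
travelling along the $x^3$ direction.

```latex
\begin{align*}
% Euler-Lagrange equation for each component (no sum over i)
\left(\partial_0^2 - \nabla^2 - \alpha\,\partial_i^2\right)\phi^i &= 0, \\
% Plane wave along x^3: only \partial_0 and \partial_3 act non-trivially
\phi^i \propto e^{\,i\omega x^0 - i k x^3}
  \;\;\Longrightarrow\;\;
  -\omega^2 + k^2 + \alpha\,k^2\,\delta^{i3} &= 0, \\
% Transverse components, i = 1, 2
\omega^2 = k^2 \;\;&\Longrightarrow\;\; v = \omega/k = 1, \\
% Longitudinal component, i = 3
\omega^2 = (1+\alpha)\,k^2 \;\;&\Longrightarrow\;\; v = \sqrt{1+\alpha}.
\end{align*}
```

The extra gradient term only affects the component polarized along the direction of propagation,
which is why the longitudinal mode alone departs from $v = 1$.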

This agrees with the result, discussed in [HHI and in Chapter 3, that spontaneous Lorentz
violation gives us not only two transverse Goldstone bosons (which we could identify as
emergent photons) but also an extra polarization with an unusual dispersion relation. In
[30], where the Lorentz-breaking VEV was imagined to be spacelike, that extra polarization
was timelike. In our case it is a longitudinal polarization because the VEV in Eq. (4.7) was
chosen to be timelike.



4.4 The long-range gravitational preferred-frame effect 

With gravity present the situation is more subtle. One expects the gravitons to "eat"
the Goldstone bosons, producing a more complicated spectrum [79, 80]. The covariant
generalization of the constraint equation becomes

$$g_{\mu\nu}(x)S^\mu(x)S^\nu(x) = 1 \tag{4.13}$$

and in the action for $S^\mu$ we replace $\partial_\mu \to \nabla_\mu$.

³The other possible term, $\epsilon^{\mu\nu\rho\sigma}\partial_\mu S_\nu\partial_\rho S_\sigma$, is a total derivative.

⁴Notice that in our convention $S^\mu$ is dimensionless and the $c_i$'s have mass dimension two.

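Note that there is no Higgs mechanism to give the graviton a mass. For a gauge theory
we have the covariant derivative $D_\mu = \partial_\mu - ieA_\mu$, so that $(D_\mu\phi)^2$ gives a
term proportional to $A_\mu A^\mu$, i.e., a gauge boson mass, when $\langle\phi\rangle \neq 0$. But in
the case of gravity coupled to a vector field we have

$$\nabla_\mu S^\nu = \partial_\mu S^\nu + \Gamma^\nu_{\mu\rho}S^\rho, \tag{4.14}$$

with

$$\Gamma^\nu_{\rho\mu} = \frac{1}{2}\left(\partial_\rho h^\nu{}_\mu + \partial_\mu h^\nu{}_\rho - \partial^\nu h_{\rho\mu}\right), \tag{4.15}$$

so that there is no way to get a term proportional to $S^2h^2$.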
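
The absence of a graviton mass term can be made explicit by expanding the covariant derivative
around the background. A minimal sketch, assuming $S^\mu = v^\mu + \phi^\mu$ with constant
$v^\mu$ and $g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}$, and keeping only terms linear in the
fluctuations:

```latex
\begin{align*}
% Linearized covariant derivative, using Eqs. (4.14) and (4.15) with constant v^\mu
\nabla_\mu S^\nu &= \partial_\mu \phi^\nu + \Gamma^\nu_{\mu\rho}\,v^\rho + \ldots \\
  &= \partial_\mu \phi^\nu
   + \frac{1}{2}\,v^\rho\left(\partial_\mu h^\nu{}_\rho + \partial_\rho h^\nu{}_\mu
   - \partial^\nu h_{\mu\rho}\right) + \ldots \\
% Every term carries one derivative, so any operator quadratic in \nabla S
% has the schematic form
(\nabla S)^2 &\sim (\partial\phi)^2 + (\partial\phi)(v\,\partial h) + (v\,\partial h)^2 .
\end{align*}
```

Since $\nabla_\mu S^\nu$ vanishes on the background, the metric factors used to contract indices
contribute only at cubic order, and no non-derivative $h^2$ term appears. This contrasts with the
$A_\mu A^\mu\langle\phi\rangle^2$ term generated by $(D_\mu\phi)^2$ in a gauge theory.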

Compare this to the ghost condensate mechanism described in [81], where $\mathcal{L} = P(X)$ for
$X = g_{\mu\nu}\partial^\mu\phi\,\partial^\nu\phi$. If we assume that

$$P'(X = c_*^2 \neq 0) = 0, \tag{4.16}$$

then, in the preferred frame, this implies that

$$\langle X\rangle = c_*^2 = \langle\dot\phi\rangle^2 \tag{4.17}$$

and the term in $P(X)$ that is quadratic in the fluctuation of $X$ gives a graviton mass term
$\propto h_{00}^2$. This is different from our case, where we get five massless graviton
polarizations with different propagation velocities.

Going back to our model, we see that local diffeomorphisms can be used to gauge
away the three Goldstone bosons. Under a local diffeomorphism (which preserves the
constraint Eq. (4.13)),

$$S'^\mu(x') = \frac{\partial x'^\mu}{\partial x^\nu}\,S^\nu(x) \tag{4.18}$$

and with $x'^\mu = x^\mu + \epsilon^\mu$, $S^\mu = v^\mu + \phi^\mu$,

$$\phi'^\mu(x') = \phi^\mu(x) + v^\rho\partial_\rho\epsilon^\mu, \tag{4.19}$$

from which we can determine $\epsilon^\mu$ to completely remove $\phi^\mu$. Note that in the
preferred frame, $\epsilon^i$ can be used to remove $\phi^i$. In this gauge, the constraint Eq.
(4.13) reduces to

$$S^0 = 1 - h_{00}(x)/2. \tag{4.20}$$

The residual gauge invariance left in $\epsilon^0$ can be used to remove $h_{00}$. This is an
inconvenient choice when the sources are static. In a more general frame with
$\langle S^\mu\rangle = v^\mu$, obtained by a uniform Lorentz boost from the preferred frame, the
constraint Eq. (4.13) is solved by

$$S^\mu(x) = v^\mu\left(1 - v^\rho v^\sigma h_{\rho\sigma}(x)/2\right). \tag{4.21}$$
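
That Eq. (4.21) solves the covariant constraint (4.13) to linear order in the metric
perturbation can be verified directly. The check below assumes $g_{\mu\nu} = \eta_{\mu\nu} +
h_{\mu\nu}$, the form $S^\mu = v^\mu(1 - v^\rho v^\sigma h_{\rho\sigma}/2)$ for Eq. (4.21), and a
unit time-like background, $\eta_{\mu\nu}v^\mu v^\nu = 1$.

```latex
\begin{align*}
g_{\mu\nu}S^\mu S^\nu
  &= \left(\eta_{\mu\nu} + h_{\mu\nu}\right) v^\mu v^\nu
     \left(1 - \tfrac{1}{2}\,v^\rho v^\sigma h_{\rho\sigma}\right)^2 \\
  % use \eta_{\mu\nu} v^\mu v^\nu = 1 and expand the square to first order in h
  &= \left(1 + v^\mu v^\nu h_{\mu\nu}\right)
     \left(1 - v^\rho v^\sigma h_{\rho\sigma}\right) + O(h^2) \\
  &= 1 + O(h^2).
\end{align*}
```

In the preferred frame, $v^\mu = \delta^\mu_0$, the same expression reduces to
$S^0 = 1 - h_{00}/2$, which is Eq. (4.20).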

Next we discuss a toy model that provides an example of a more complete theory that
at low energies reduces to the theory described above, with the vector field satisfying the unit
covariant constraint (4.13).⁵ Consider the following non-gauge-invariant theory for a vector
boson $A^\mu$,

$$\mathcal{L} = -\frac{1}{2}\,g_{\mu\nu}g^{\rho\sigma}\nabla_\rho A^\mu\nabla_\sigma A^\nu + \lambda\left(g_{\mu\nu}A^\mu A^\nu - v^2\right)^2. \tag{4.22}$$

Fluctuations about the minimum are given by

$$g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}, \qquad A^\mu = v^\mu + \psi^\mu. \tag{4.23}$$

This theory has one massive state $\Phi$ with mass $\propto \lambda^{1/2}v$, which is

$$\Phi = v^\mu\psi_\mu + v^\mu v^\nu h_{\mu\nu}/2. \tag{4.24}$$

In the limit that $\lambda \to \infty$ this state decouples from the remaining massless states. In
the preferred frame the only massless states are $h_{\mu\nu}$ and $\psi^i$. Since we have decoupled
the heavy state, we should expand

$$A^0 = v + \left[\psi^0 + vh_{00}/2\right] - vh_{00}/2 \to v - vh_{00}/2, \tag{4.25}$$

where in the last limit we have decoupled the heavy state. Note that this parameterization
of $A^0$ is precisely the same parameterization that we had above for $S^0$. In other words,
in the limit that we decouple the only heavy state in this model, the field $A^\mu$ satisfies
$g_{\mu\nu}A^\mu A^\nu = v^2$, which is the same as the constraint (4.13) with $A^\mu \to vS^\mu$.

⁵For a related example, see [80].

In the unitary gauge with $\phi^i = 0$, the only massless degrees of freedom are the gravitons.
There are the two helicity modes, which in the Lorentz-invariant limit correspond to the
two spin-2 gravitons, along with three more helicities that are the Goldstone bosons, for a
total of five. The sixth would-be helicity mode is gauged away by the remaining residual
gauge invariance.

But the model that we started from does have a ghost, since we wrote a kinetic term
for $A^\mu$ that does not correspond to the conventional Maxwell kinetic action. The ghost in
the theory is $A^0$, which in our case is massive. The presence of this ghost means that this
field theory model is not a good high-energy completion for the low-energy theory involving
only $S^\mu$ and gravity that we are considering in this section. We assume that a sensible
high-energy completion exists for generic values of the $c_i$'s.

Now we proceed to compute the preferred-frame coefficient $\alpha_2$ appearing in the
modification to Newton's law.

The action we consider is

$$S = \int d^4x\,\sqrt{g}\,\left(\mathcal{L}_{EH} + \mathcal{L}_V + \mathcal{L}_{gf}\right), \tag{4.26}$$

with⁶

$$\mathcal{L}_{EH} = -\frac{1}{16\pi G}\,R \tag{4.27}$$

and

$$\mathcal{L}_V = c_1\nabla_\alpha S^\beta\nabla^\alpha S_\beta + c_2\nabla_\mu S^\mu\nabla_\nu S^\nu + c_3\nabla_\mu S^\nu\nabla_\nu S^\mu + c_4 S^\mu\nabla_\mu S^\alpha\,S^\nu\nabla_\nu S_\alpha. \tag{4.28}$$

This is the most general action involving two derivatives acting on $S^\mu$ that contributes to the
two-point function. Note that a coefficient $c_3$ appears, since in curved space-time covariant
derivatives do not commute. Other terms involving two derivatives acting on $S^\mu$ may be
added to the action, but they are either equivalent to a combination of the operators already
present (such as adding $R_{\mu\nu}S^\mu S^\nu$), or they vanish because of the constraint Eq. (4.13).
We assume generic values for the coefficients $c_i$ that in the low energy effective theory give no
ghosts or gradient instabilities.

⁶The coefficients $c_i$ appearing here are related to those appearing in, for example, [73] by
$c_i^{\text{here}} = -c_i^{\text{there}}/16\pi G$.

As previously discussed, $S^\mu$ satisfies the constraint (4.13). We also assume that it does
not directly couple to Standard Model fields. In the literature, Eq. (4.13) is enforced by
introducing a Lagrange multiplier into the action. Here we enforce the constraint by directly
solving for $S^\mu$, as given by Eq. (4.21), and then insert that solution back into the action to
obtain an effective action for the metric.

In our approach there is a residual gauge invariance that in the preferred frame
corresponds to reparameterizations involving $\epsilon^0$ only. To completely fix the gauge we add
the gauge-fixing term

$$\mathcal{L}_{gf} = -\frac{\alpha}{2}\left(S^\mu S^\rho S^\sigma\partial_\mu h_{\rho\sigma}\right)^2. \tag{4.29}$$

Neglecting interaction terms, in the preferred frame the gauge-fixing term reduces to

$$\mathcal{L}_{gf} = -\frac{\alpha}{2}\left(\partial_0 h_{00}\right)^2. \tag{4.30}$$

Physically, this corresponds in the $\alpha \to \infty$ limit to removing all time dependence in
$h_{00}$ without removing the static part, which is the gravitational potential. This is a
convenient gauge in which to compute when the sources are static.

At the two-derivative level, the only effect in this gauge of the new operators is to modify
the kinetic terms for the graviton. The dispersion relation for the five helicities will be of
the form $E = \beta|\vec k|$, where the velocities $\beta$ are not the same for all helicities and
depend on the parameters $c_i$ [73]. This spectrum is different from that which is found in the
"ghost condensate" theory, where in addition to the two massless graviton helicities, there
exists a massless scalar degree of freedom with a non-relativistic dispersion relation
$E \propto |\vec k|^2$ [81].

There exists a range for the $c_i$'s in which the theory has no ghosts and no gradient
instabilities [73]. In particular, for small $c_i$'s, no gradient instabilities appear if

$$\frac{c_1+c_2+c_3}{c_1+c_4} > 0 \quad\text{and}\quad \frac{c_1}{c_1+c_4} > 0. \tag{4.31}$$

The condition for having no ghosts is simply $c_1 + c_4 < 0$.
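
The conditions in Eq. (4.31) can be understood from the flat-space quadratic action for the
Goldstone bosons. The sketch below is a consistency check under stated assumptions rather than
the derivation used in the literature: it takes the flat-space operators of Eq. (4.11), signature
$(+,-,-,-)$, and the expansion $S^0 \simeq 1 + \vec\phi^{\,2}/2$, $S^i \simeq \phi^i$, dropping
total derivatives and terms beyond quadratic order. The labels $v_T$ and $v_L$ for the transverse
and longitudinal speeds are introduced here for illustration.

```latex
\begin{align*}
% Term-by-term expansion of Eq. (4.11) to quadratic order in \phi^i
c_1\,\partial_\alpha S^\beta \partial^\alpha S_\beta
  &\simeq -c_1\left[(\dot\phi^i)^2 - (\partial_j\phi^i)^2\right], \\
(c_2+c_3)\,(\partial_\mu S^\mu)^2
  &\simeq (c_2+c_3)\,(\partial_i\phi^i)^2, \\
c_4\,S^\mu\partial_\mu S^\alpha\, S^\nu\partial_\nu S_\alpha
  &\simeq -c_4\,(\dot\phi^i)^2, \\
% Sum of the three contributions (repeated indices summed)
\mathcal{L}_2 &= -(c_1+c_4)\,(\dot\phi^i)^2 + c_1\,(\partial_j\phi^i)^2
   + (c_2+c_3)\,(\partial_i\phi^i)^2, \\
% Squared speeds of the transverse and longitudinal modes
v_T^2 = \frac{c_1}{c_1+c_4}, &\qquad v_L^2 = \frac{c_1+c_2+c_3}{c_1+c_4}.
\end{align*}
```

A positive kinetic term requires $c_1 + c_4 < 0$, and real propagation speeds require both ratios
to be positive, which reproduces Eq. (4.31). Setting $c_4 = 0$ gives $v_T = 1$ and
$v_L = \sqrt{1 + (c_2+c_3)/c_1}$, in agreement with the speeds quoted after Eq. (4.12).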

The correction to Newton's law in Eq. (4.8) is of linear order in the source. Thus to
determine its size we only need to find the graviton propagator, since the non-linearity of
gravity contributes at higher order in the source. In order to compute that term we have
to specify a coordinate system, of which there are two natural choices. In the universal
rest frame, the sources, such as the solar system or Milky Way galaxy, will be moving and
the computation is difficult. We instead choose to compute in the rest frame of the source,
which is moving at a speed $|\vec w| \ll 1$ relative to the universal rest frame. Observers in
that frame will observe the Lorentz-breaking VEV $v^\mu \simeq (1, -\vec w)$. In the rest frame of
the source, a modified gravitational potential will be generated. Technically this is because
terms in the graviton propagator proportional to $v\cdot k \simeq \vec w\cdot\vec k$ are
non-vanishing. It is natural to assume that dynamical effects align the universal rest frame,
where $v^\mu = \delta^\mu_0$, with the rest frame of the cosmic microwave background.

In a general coordinate system moving at a constant speed with respect to the universal
frame the Lorentz-breaking VEV will be a general time-like vector $v^\mu$. Thus we need
to determine the graviton propagator for a general time-like constant $v^\mu$. Since Lorentz
invariance is spontaneously broken, the numerator of the graviton propagator is the most
general tensor constructed out of the vectors $v^\mu$, $k^\nu$ and the tensor $\eta^{\mu\nu}$. There
are 14 such tensors. Writing the action for the gravitons as

$$S = \frac{1}{2}\int d^4k\;\tilde h^{\alpha\beta}(-k)\,K_{\alpha\beta|\mu\nu}(k)\,\tilde h^{\mu\nu}(k), \tag{4.32}$$

it is a straightforward exercise to determine the graviton propagator $\mathcal{P}$ by solving

$$K_{\alpha\beta|\mu\nu}(k)\,\mathcal{P}^{\mu\nu|\rho\sigma}(k) = \frac{1}{2}\left(\delta_\alpha^\rho\delta_\beta^\sigma + \delta_\alpha^\sigma\delta_\beta^\rho\right). \tag{4.33}$$

The above set of conditions leads to 21 linear equations that determine the 14 coefficients
of the graviton propagator in terms of the coefficients $c_i$ and the VEV $v^\mu$. Seven equations
are redundant and provide a non-trivial consistency check on our calculation.

Although it is necessary to compute all 14 coefficients in order to invert the propagator, 
here we present only those that modify Newton's law as described previously (assuming 
stress-tensors are conserved for sources). These are 

^N^wton = {Ar7°^r?^^ + B(r/"V + r/"V'') + C(i;"A'"' + «''«V) 

We find that each of these coefficients is independent of the gauge parameter $\alpha$. We also
numerically checked that without the presence of the gauge-fixing term the propagator could
not be inverted.

To compute the preferred-frame effect coefficient $\alpha_2$, we only need to focus on terms in
the momentum-space propagator proportional to $(v\cdot k)^2$. To leading non-trivial order in
$G(v\cdot k)^2$ and in the $c_i$'s we obtain, from the linear combination $A + 2B + 2C + D + 4E$,

/d^k \ { (v ■ k)"^ 1 

(2^F 1^ - ^"^^ k^ c,(c, + C2 + C3) + '^^(^^ + 

+C?(3C2 + 5C3 + 3C4) + Ci((6C3 - C4)(C3 + C4) + €2(603 + cM \ f'^^k) , (4.35) 



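where in the first line $k$ is a four-vector. Next we use $v^\mu = (1, -\vec w)$, place the source
at the origin, substitute $T^{00} = M\delta^{(3)}(\vec x)$ or $\tilde T^{00}(k) = 2\pi M\delta(k^0)$
and use

$$\int\frac{d^3k}{(2\pi)^3}\,\frac{k_ik_j}{k^4}\,e^{i\vec k\cdot\vec x} = \frac{1}{8\pi r}\left(\delta_{ij} - \frac{x_ix_j}{r^2}\right) \tag{4.36}$$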

to obtain

$$g_{00} = 1 - 2G_N\frac{M}{r}\left(1 - \frac{(\vec w\cdot\vec r)^2}{r^2}\,\frac{8\pi G}{2c_1(c_1+c_2+c_3)}\Big[2c_1^3 + 4c_3^2(c_2+c_3) + c_1^2(3c_2+5c_3+3c_4) + c_1\big((6c_3-c_4)(c_3+c_4) + c_2(6c_3+c_4)\big)\Big]\right), \tag{4.37}$$

where we have only written those terms that give a correction to Newton's law proportional
to $(\vec w\cdot\vec r/r)^2$. We have also assumed that $|\vec w| \ll 1$ so that higher powers in
$\vec w\cdot\vec r/r$ can be neglected. The factor of $1/c_1$ in the preferred-frame correction to
the metric arises because when $c_1 \to 0$ the "transverse" components of $\phi^i$ have no spatial
gradient kinetic term. Similarly, the factor of $1/(c_1+c_2+c_3)$ arises because when
$c_1+c_2+c_3 \to 0$ the "longitudinal" component of $\phi^i$ has no spatial gradient kinetic term.
Either of these cases causes a divergence in the static limit.⁷

The coefficients $c_i$ redefine Newton's constant measured in solar system experiments and
we find that

$$G_N = G\left[1 - 8\pi G(c_1+c_4)\right], \tag{4.38}$$

which agrees with previous computations to linear order in the $c_i$'s after correcting for the
differences in notation [76, 77].

⁷This divergence can of course be avoided by considering higher-derivative terms in the action for the
Goldstone bosons. This would then give non-relativistic dispersion relations for these modes,
$E \propto |\vec k|^n$ for $n > 1$, as was the case in [81].

The experimental bounds on deviations from Einstein gravity in the presence of a source
are usually expressed as constraints on the metric perturbation. Since the metric is not
gauge-invariant, these bounds are meaningful only once a gauge is specified. In the
literature, the bounds are typically quoted in harmonic gauge. Here, the preferred-frame effect is
a particular term appearing in the solution for $h_{00}$. For static sources, the gauge
transformation needed to translate the solution in our gauge to the harmonic gauge is itself
static. But since a static gauge transformation cannot change $h_{00}$, we may read off the
coefficient of the preferred-frame effect in the gauge that we used.

By inspection,

$$\alpha_2 \approx \frac{8\pi G}{c_1(c_1+c_2+c_3)}\Big[2c_1^3 + 4c_3^2(c_2+c_3) + c_1^2(3c_2+5c_3+3c_4) + c_1\big((6c_3-c_4)(c_3+c_4) + c_2(6c_3+c_4)\big)\Big], \tag{4.39}$$

which can be compared with the experimental bound $|\alpha_2| < 4\times10^{-7}$ given in [70]. After
[54] was published, Foster and Jacobson [82] carried out the full computation of $\alpha_2$ in terms
of the $c_i$ parameters in the vector-tensor theory with the unit constraint and confirmed that
Eq. (4.39) is correct to leading non-trivial order.

The experimental bound on $\alpha_2$ is obtained by considering the torque that the effect in
Eq. (4.8) would exert on the plane of the orbit of a planet. For simplicity, let us consider a
circular planetary orbit of radius $r$, moving around the sun, whose velocity $\vec w$ with respect
to the preferred frame we take to be aligned with the $z$-axis, as shown in Fig. 4.1. The
average torque over one orbit is

$$\vec\tau = -\hat x\,\frac{\alpha_2 G_N M_1 M_2 w^2}{4r}\,\sin 2\theta_0, \tag{4.40}$$

where $\theta_0$ is the inclination between the plane of the planet's orbit and the axis of $\vec w$.

This torque would cause the orbital planes of the planets in the solar system to precess at
different rates, unless all the orbital planes were perfectly aligned or anti-aligned with the
axis of $\vec w$. If we consider, for instance, the orbits of Earth and Mercury, whose planes are
aligned to within a few degrees, and then consider Eq. (4.40) with

Figure 4.1: Diagram of a planet moving in a circular orbit of radius $r$ around the sun (located at
the origin), whose velocity with respect to the preferred frame is $\vec w$. The inclination between
the plane of the orbit and the axis of $\vec w$ is $\theta_0$ (the minimum value of the polar angle
$\theta$ during the planet's orbit).

• $M_1$ = solar mass

• $|\vec w| \sim 10^{-3}$ (the sun's speed with respect to the CMB rest frame)

• $\sin 2\theta_0 \sim O(1)$

then the fact that Mercury and the Earth have maintained their approximate alignment
over the age of the solar system ($\sim 4.5\times10^{9}$ years) gives us, roughly, the bound in the
literature of $|\alpha_2| < 10^{-7}$.

A considerably stronger constraint on the size of the $c_i$'s can be derived from the fact
that a particle moving faster than one of the graviton polarizations would lose energy 
through gravitational Cerenkov radiation. In particular, this gravitational Cerenkov radi- 
ation would limit the flux of the highest-energy cosmic rays (which are protons moving at 
nearly the speed of light). Depending on the exact assumptions regarding the abundance 
and distribution of cosmic ray sources, the resulting bound can range from G|ci| < 10"^^ to 
G\ci\ < 10^^^ (inZl)- These limits, however, apply only if the extra graviton polarizations 
propagate subluminally. We will have more to say on this issue in Section 4.5.

Figure 4.2: Feynman diagram for the (negligible) modification to gravity by the coupling of the
graviton to acoustic perturbations in the CMB.

4.5 A cosmic solid 

We know that the effect considered in Section 4.4, the modification of gravity by the presence
of a background with a rest frame, is present in nature, because the electromagnetic
radiation in the CMB has a conserved Poynting 4-vector:



This background modifies gravity because gravitons can couple to acoustic
perturbations in it, as shown in Fig. 4.2. This effect is, however, completely negligible, since
the characteristic energy scale of the CMB is $T_{\rm CMB} \sim 2.7$ K, which means that this effect is
suppressed by a factor of



The question remains, however, whether there might be some other background that, unlike 
the CMB, couples strongly to gravity (and only to gravity, so as to explain why it has not 
been otherwise detected). The Goldstone bosons of spontaneous Lorentz violation would 
correspond to the sound waves in this background, and the modification to gravity comes, 
as it did in Fig. 4.2, from the mixing of the gravitons with these acoustic modes. 
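For a sense of scale, the CMB suppression mentioned above can be estimated numerically. The sketch below assumes the suppression factor is of order $(T_{\rm CMB}/M_{\rm Pl})^2$, which is an assumption on our part, since the displayed expression is not available here. The constant names and the helper `cmb_suppression` are hypothetical.

```python
# Rough estimate of (T_CMB / M_Pl)^2 in natural units.
# ASSUMPTION: the suppression factor has this form; the names below are illustrative only.
K_B_EV_PER_K = 8.617333e-5   # Boltzmann constant in eV/K
T_CMB_K = 2.7                # CMB temperature quoted in the text, in kelvin
M_PL_EV = 1.22e28            # Planck mass in eV (about 2.4e27 eV if the reduced mass is used)


def cmb_suppression(m_pl_ev: float = M_PL_EV) -> float:
    """Return (T_CMB / M_Pl)^2 with both scales expressed in eV."""
    t_cmb_ev = K_B_EV_PER_K * T_CMB_K
    return (t_cmb_ev / m_pl_ev) ** 2


if __name__ == "__main__":
    print(f"unreduced Planck mass: {cmb_suppression():.1e}")
    print(f"reduced Planck mass:   {cmb_suppression(2.435e27):.1e}")
```

With either convention for the Planck mass the factor is smaller than $10^{-60}$, which is the sense in which the effect is negligible.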




1 



(4.41) 




(4.42) 



In [73], the authors find the propagation velocities of the five graviton polarizations in 
vector-tensor theories with the unit constraint. In our language, these are the velocities of 
the two usual gravitons plus the three acoustic modes in the Lorentz-violating background: 

$$
\begin{aligned}
\text{2 transverse traceless metric modes:} \quad & v^2 = \frac{1}{1 - c_{13}} \to 1 \\
\text{2 transverse Goldstones:} \quad & v^2 = \frac{c_1 - c_1^2/2 + c_3^2/2}{c_{14}\,(1 - c_{13})} \to \frac{c_1}{c_{14}} \\
\text{1 longitudinal Goldstone:} \quad & v^2 = \frac{c_{123}\,(2 - c_{14})}{c_{14}\,(1 - c_{13})\,(2 + c_{13} + 3c_2)} \to \frac{c_{123}}{c_{14}}
\end{aligned}
\tag{4.43}
$$

where $c_{i \ldots k} = G(c_i + \ldots + c_k)$ and where the limits correspond to vanishing $c_i$'s. Since, for 
general $c_i$'s, there are two distinct sound speeds, one for the longitudinal and one for the 
transverse modes, our Lorentz-violating background fulfills the canonical definition of a 
solid.* The transverse sound speed is associated with a shear mode (a deformation which 
alters the shape but not the volume of a body). Linear shear waves are absent in a fluid 
(see, for instance, Chapters III and VI in [83]). 
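The mode speeds of Eq. (4.43) are easy to evaluate for trial couplings. The sketch below is illustrative only: the function name `mode_speeds_squared` is hypothetical, and its arguments are assumed to be the dimensionless combinations $G c_i$, so that the subscripted sums are plain sums of the inputs.

```python
def mode_speeds_squared(c1: float, c2: float, c3: float, c4: float):
    """Squared speeds (tensor, transverse, longitudinal) from Eq. (4.43).

    ASSUMPTION: the arguments are the dimensionless products G*c_i.
    """
    c13 = c1 + c3
    c14 = c1 + c4
    c123 = c1 + c2 + c3
    tensor = 1.0 / (1.0 - c13)
    transverse = (c1 - c1**2 / 2.0 + c3**2 / 2.0) / (c14 * (1.0 - c13))
    longitudinal = c123 * (2.0 - c14) / (c14 * (1.0 - c13) * (2.0 + c13 + 3.0 * c2))
    return tensor, transverse, longitudinal


if __name__ == "__main__":
    # Small couplings: the results approach 1, c1/c14 and c123/c14.
    print(mode_speeds_squared(1e-6, 2e-6, 3e-6, 4e-6))
```

For this particular choice the longitudinal mode has $v^2 \approx c_{123}/c_{14} = 1.2$, so it is superluminal, while the transverse modes are subluminal with $v^2 \approx 0.2$.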

In Section 4.4 we emphasized the difference between our model, which we may now 
refer to as the "cosmic solid" model, and the "ghost condensate" of [81]. In [81], Lorentz 
invariance is broken by a VEV for a spin-0 vector field, $\partial_\mu \phi$, with a single degree of 
freedom, whereas in the cosmic solid model the Lorentz invariance is broken by a spin-1 
vector field with three degrees of freedom. Therefore the ghost condensate has a 
single Goldstone boson, with non-relativistic dispersion relation $E \propto |\mathbf{k}|^2$, and it gives the 
graviton a mass when minimally coupled to it, whereas the cosmic solid has three Goldstone 
bosons, with dispersion relations $E \propto |\mathbf{k}|$, and it does not give the graviton a mass. It turns 
out that if the ghost condensate is gauged (i.e., if the ghost condensate field $\phi$ is minimally 
coupled to a $U(1)$ gauge field $A_\mu$), then the two polarizations of the gauge field provide 
the two extra degrees of freedom, and the resulting model is equivalent to the cosmic solid 
([84]). Whether the ghost condensate itself admits a high-energy completion is unresolved 

(see iHniiHn]). 

It can be seen from Eq. (4.43) that the speeds of the Goldstone bosons can be made 
superluminal without introducing ghosts or other obvious problems in the low-energy effective 
action. As pointed out in Section 4.4, if the Goldstones are required to be subluminal, then 
$\alpha_2$ no longer gives the strongest constraint on the size of the $c_i$'s because a far more 
stringent limit applies, from the gravitational Cerenkov radiation of the highest-energy cosmic 
rays. Superluminal Goldstone bosons would evade that constraint. Whether superluminality 
could result from a reasonable high-energy completion, and whether the initial value 
problem in the low-energy effective action is well-posed in the presence of superluminal 
modes, remain open questions. 

*This was brought to my attention by Juan Maldacena and Ian Low. 

Chapter 5 

Some considerations on the 
cosmological constant problem 

5.1 Introduction 

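Consider Einstein-Hilbert gravity as an effective theory, containing all the terms compatible 
with its symmetries: 

$$S = \int d^4x \sqrt{-g}\,\left[\mathcal{L}_{\rm matter}(\phi, g_{\mu\nu}) - 2\Lambda + M_{\rm Pl}^2 R + \cdots\right] , \tag{5.1}$$

where $M_{\rm Pl} = \sqrt{1/8\pi G}$ and 

(5.2) 

The equation of motion for the metric $g^{\mu\nu}$ is 

$$R_{\mu\nu} - \frac{1}{2} g_{\mu\nu} R - \Lambda g_{\mu\nu} = 8\pi G\, T_{\mu\nu} . \tag{5.3}$$

We would naturally expect that 

$$\Lambda \sim M_{\rm Pl}^4 \sim (10^{28}\ \text{eV})^4 . \tag{5.4}$$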

Figure 5.1: Feynman diagram of the coupling of a single graviton to the cosmological constant $\Lambda$ in 
Eq. (5.1). The blob may also be thought of as a collection of vacuum-to-vacuum quantum processes. 

If we let 

(5.5) 

where $h^{\mu\nu}$ is the graviton field, then the $\Lambda$ term in Eq. (5.1) gives 

(5.6) 

The first term in the right-hand side of Eq. (5.6) is an irrelevant constant, but the second 
gives a tadpole diagram for the graviton, as shown in Fig. 5.1. By Eq. (5.4) we would 
therefore expect this tadpole interaction to be of order $M_{\rm Pl}^3$. 

Alternatively, one can think of this tadpole diagram, shown in Fig. 5.1, as the coupling 
of a single graviton to the quantum-mechanical vacuum energy. This corresponds to moving 
the $\Lambda g_{\mu\nu}$ in Eq. (5.3) from the left-hand to the right-hand side and thinking of it as the 
contribution to the matter $T_{\mu\nu}$ from the quantum-mechanical vacuum energy. In quantum 
field theory, each frequency mode of the free field is a simple harmonic oscillator. Therefore, 
each mode has a zero-point energy $E = \omega/2$. We clearly have to cut off the sum at some 
scale, but the successes of quantum field theory so far suggest that the cutoff scale can't be 
much smaller than $\sim 1$ TeV. 
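The size of the mismatch can be illustrated with a short estimate. The sketch below sums the zero-point energies $\omega/2$ of a single massless bosonic degree of freedom up to a momentum cutoff, which gives $\rho = k_{\max}^4/16\pi^2$ in natural units. The 1 TeV cutoff follows the text; the observed dark energy scale of roughly $(2.3 \times 10^{-3}\ \text{eV})^4$ and all names in the code are assumptions introduced for illustration.

```python
import math


def vacuum_energy_density(cutoff_ev: float) -> float:
    """Zero-point energy density in eV^4 for one massless bosonic mode family.

    rho = integral d^3k / (2 pi)^3 * (k / 2) for |k| < cutoff = cutoff^4 / (16 pi^2).
    """
    return cutoff_ev**4 / (16.0 * math.pi**2)


# ASSUMED inputs (illustrative, not taken from the text beyond the 1 TeV scale):
CUTOFF_EV = 1.0e12                     # 1 TeV
OBSERVED_DARK_ENERGY = (2.3e-3) ** 4   # roughly (2.3 meV)^4, in eV^4

ratio = vacuum_energy_density(CUTOFF_EV) / OBSERVED_DARK_ENERGY

if __name__ == "__main__":
    print(f"estimated / observed vacuum energy density: {ratio:.1e}")
```

Even with this comparatively low cutoff the estimate exceeds the observed value by more than fifty orders of magnitude; a Planck-scale cutoff makes the discrepancy far larger.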

In any case, we get a positive value of $\Lambda$ (the "cosmological constant") far, far in excess of 
what observation allows. To see qualitatively the effect of large positive $\Lambda$, imagine vacuum 
energy inside a piston. Its energy density, $\rho$, is constant. If the piston is pulled out, as 
shown in Fig. 5.2, the total energy must increase: $dE = \rho\, dV$. By energy conservation, we 
must have supplied that energy when we pulled on the piston: $dW = F\, dl = p\, dV = -dE$. 
Therefore the piston would resist being pulled out: the pressure is negative, $p = -\rho$. 

The Newtonian limit of GR for a test mass on the edge of a uniform sphere of radius $r_0$ 
gives an acceleration 

(5.7) 

Therefore, the quantum vacuum energy would anti-gravitate. A value of $\Lambda$ as large as 
what we would expect on effective theory or quantum mechanical grounds would rip apart 
the universe, preventing it from developing any structure. It was long presumed that some 
unknown symmetry of quantum gravity would forbid the $\Lambda$ term in Eq. (5.1), thus naturally 
making the cosmological constant zero. In Chapter 3, we discussed one such idea: that the 
graviton was a Goldstone boson of spontaneous Lorentz violation, so that the broken Lorentz 
invariance protected it from getting any potential at all. 

Figure 5.2: Consider a piston filled with vacuum energy, whose density is constant. By energy 
conservation, the piston must resist being pulled out, and therefore the vacuum energy exerts negative 
pressure. 
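The anti-gravitating behaviour described above can be checked with the standard weak-field result in which pressure also gravitates, $\ddot{r} = -\frac{4\pi G}{3}(\rho + 3p)\, r_0$. This is a sketch under that assumption; the formula is the textbook one and is not copied from the missing display, and the function name `edge_acceleration` is hypothetical.

```python
import math


def edge_acceleration(rho: float, p: float, r0: float, G: float = 1.0) -> float:
    """Radial acceleration of a test mass at the edge of a uniform sphere.

    ASSUMPTION: standard weak-field GR source (rho + 3p); negative means attraction.
    """
    return -(4.0 * math.pi * G / 3.0) * (rho + 3.0 * p) * r0


if __name__ == "__main__":
    print("dust, p = 0:       ", edge_acceleration(1.0, 0.0, 1.0))
    print("vacuum, p = -rho:  ", edge_acceleration(1.0, -1.0, 1.0))
```

Pressureless matter gives an inward acceleration, whereas $p = -\rho$ flips the sign and gives an outward acceleration of magnitude $8\pi G \rho r_0 / 3$.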

Data on the accelerated expansion of the universe, however, have recently shown that 
there is a small but non-zero anti-gravitating term [87, 88]. Two possible approaches to this 
cosmological constant problem that will be of interest to us here are: 

• to imagine that the true $\Lambda$ is zero, but that the universe contains some other field, 
coupled only to gravity, which accounts for the accelerated expansion. 

• to imagine that the value of $\Lambda$ varies over some landscape of possible universes, and 
that we naturally happen to live in one where $\Lambda$ is small enough that structure (and 
therefore intelligent life) may form. 

The first line of thought will lead us in Section 5.2 to consider whether a cosmological scalar 
field can have a pressure more negative than $-\rho$. In Section 5.3 we will consider how the 
ghost condensate of [81] would behave if it were responsible for the accelerated expansion 
of the universe. In Section 5.4 we will re-examine the second line of thought in light of the 
proposal that other parameters besides $\Lambda$ vary over the landscape of possible universes. 

5.2 Gradient instability for scalar models of the dark energy with $w < -1$ 

Matter whose equation of state satisfies $w = p/\rho < -1$ violates a number of conditions, 
including the weak energy condition, generally assumed to apply to any reasonable model 
of physics [80]. However, the observational data do not exclude the possibility that the dark 
energy has $w < -1$ ([90, 91]). The results reported in [91] indicate that $-1.26 < w < -0.83$ 
at 95% confidence level. The possibility $w < -1$ has been explored by numerous authors 
(see, for example, [93]-[104]). These models often contain a field with an unusual kinetic 
term, which is referred to as a phantom or ghost field. In this section we show that for $w < -1$, 
single scalar field models of the dark energy generally have a wrong sign gradient kinetic 
term for fluctuations about the homogeneous background. This result is not dependent on 
general relativistic effects and survives in the flat-space limit. Spatial inhomogeneities of the 
dark energy are tightly constrained by observations of the cosmic microwave background. 

In our analysis we will assume a time-dependent but spatially homogeneous scalar 
background, and show that for $w < -1$ spatial instabilities inevitably arise. Consider the 
low-energy effective theory of a scalar field coupled to gravity: 

$$S = \int d^4x \sqrt{-g}\,\left[M_{\rm Pl}^2 R + P + U R + V R^{\mu\nu} (\partial_\mu \phi)(\partial_\nu \phi) + \cdots\right] , \tag{5.8}$$

where $P$, $U$, and $V$ are functions of the scalar field $\phi$ and its derivatives. (Because of the 
anti-symmetry of $R^{\mu\nu\rho\sigma}$ in its first two and also in its last two indices, no non-vanishing 
invariant can be formed from it using first derivatives of $\phi$.) Naively we might expect that 
the higher-dimensional couplings of $\phi$ to the Ricci tensor would be suppressed by powers 
of the Planck mass $M_{\rm Pl}$, making them irrelevant for cosmology after the Planck epoch. 
However, such terms are generated by graphs such as that in Figure 5.3. Writing the metric 
as $g^{\mu\nu} = \eta^{\mu\nu} + h^{\mu\nu}/M_{\rm Pl}$, we see that scalar-graviton interactions in Feynman diagrams are 
suppressed by the Planck mass, but when these interactions are reassembled into the Ricci 
tensor that suppression is absent. That is, the higher-dimensional terms in Eq. (5.8) will 
appear suppressed only by powers of the characteristic energy scale of the scalar field, $M$, 
which may be much smaller than $M_{\rm Pl}$. 

We neglect terms in the action (5.8) that involve higher powers of the Ricci tensor. The 
terms we consider are ones that can generate contributions to the stress-energy tensor $T_{\mu\nu}$ 
that are not suppressed by powers of $M_{\rm Pl}$. Since $T_{\mu\nu}$ is obtained by varying the action with 
respect to the metric, terms with more than one power of $R^{\mu\nu\rho\sigma}$ yield contributions that 
are themselves proportional to the Ricci tensor and therefore vanish in the flat-space limit. 

Figure 5.3: The effective couplings of two gravitons to several quanta of the scalar field. The shaded 
region represents interactions involving only scalars. 

Assuming a spatially homogeneous background, only the time derivatives of $\phi$ will be 
non-vanishing in Eq. (5.8). It may be shown that in the limit $M_{\rm Pl} \to \infty$, the term 
$V R^{\mu\nu} (\partial_\mu \phi)(\partial_\nu \phi)$ contributes a term to the stress-energy tensor, which can be reproduced 
by an appropriate change in the function $U$. Therefore we may restrict ourselves to $V = 0$ 
and consider the most general $U$ in order to analyze the flat-space behavior of Eq. (5.8). 

It is always possible to perform a rescaling of the metric in Eq. (5.8), $g_{\mu\nu} \to e^{-\omega} g_{\mu\nu}$, 
with $\omega = \log[1 + U/M_{\rm Pl}^2]$, so that the $U$ term in Eq. (5.8) disappears, being absorbed into a 
redefinition of the $P$ action for the ghost scalar field. (See, for example, [104].) The action 
resulting from this rescaling, up to terms suppressed by powers of $1/M_{\rm Pl}$, is then 

$$S = \int d^4x \sqrt{-g}\,\left[M_{\rm Pl}^2 R + P\right] . \tag{5.9}$$

The most general Lorentz-invariant scalar Lagrangian without higher-derivative terms 
(which we will consider later) is 

$$\mathcal{L} = P(X, \phi) , \tag{5.10}$$

where $X = g^{\mu\nu}\, \partial_\mu \phi\, \partial_\nu \phi$. (A potential term $V$ would be the component of $P(X, \phi)$ that 
is independent of $X$.) Henceforth, $P'(X, \phi)$ will denote differentiation with respect to $X$. 
Since the scalar field $\phi$ is minimally coupled to gravity in Eq. (5.9), the stress-energy tensor 
is 

$$T_{\mu\nu} = -\mathcal{L}\, g_{\mu\nu} + 2 P'(X, \phi)\, \partial_\mu \phi\, \partial_\nu \phi , \tag{5.11}$$

and 

$$w = \frac{p}{\rho} = \frac{\mathcal{L}}{T_{00}} = -1 + \frac{2 P'(X, \phi)\, \dot\phi^2}{T_{00}} . \tag{5.12}$$

For $\phi$ to account for the dark energy, we must have $T_{00} > 0$. Then, $w < -1$ requires 
that $P'(X, \phi) < 0$. Let $\phi_0 = \phi_0(t)$ be a solution to the equations of motion, and consider 
the fluctuations about this solution: $\phi = \phi_0 + \pi(\mathbf{x}, t)$. When expanded in $\pi$, the effective 
Lagrangian will contain a term 

$$\mathcal{L} = -P'(X, \phi)\, |\nabla \pi|^2 + \cdots , \tag{5.13}$$

which implies that for $P'(X, \phi) < 0$ there will exist field configurations with non-zero spatial 
gradients that have lower energy than the homogeneous configuration.¹ There is no direct 
connection between the sign of $w + 1$ and that of the $\dot\pi^2$ term in the effective Lagrangian. 
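The consequence of a wrong-sign gradient term can be made explicit with a toy quadratic Lagrangian for the fluctuation, $\mathcal{L} = c_{\rm time}\, \dot\pi^2 - c_{\rm grad}\, |\nabla\pi|^2$, whose plane-wave solutions obey $\omega^2 = (c_{\rm grad}/c_{\rm time})\, k^2$. This is a minimal sketch; the coefficient names and both functions are hypothetical, with $c_{\rm grad}$ playing the role of $P'(X, \phi)$ in Eq. (5.13).

```python
import math


def omega_squared(c_time: float, c_grad: float, k: float) -> float:
    """Dispersion relation for L = c_time * pidot^2 - c_grad * |grad pi|^2."""
    return (c_grad / c_time) * k * k


def growth_rate(c_time: float, c_grad: float, k: float) -> float:
    """Exponential growth rate of a mode of wavenumber k (zero if the mode is stable)."""
    w2 = omega_squared(c_time, c_grad, k)
    return math.sqrt(-w2) if w2 < 0.0 else 0.0


if __name__ == "__main__":
    for k in (0.5, 1.0, 2.0, 4.0):
        print(k, growth_rate(1.0, -0.5, k))
```

With $c_{\rm time} > 0$ and $c_{\rm grad} < 0$ every mode grows, and the growth rate increases linearly with $k$, so the shortest wavelengths are the most unstable.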

If $P'(X, \phi)$ is negative, a finite expectation value for the gradients may be obtained 
by adding higher powers of $(\nabla \pi)^2$ to the $\pi$ Lagrangian, but this is problematic because it 
gives rise to a spatially inhomogeneous ground state for the dark energy and would lead 
to inhomogeneities far larger than the limit of $10^{-5}$ imposed by observations of the cosmic 
microwave background.² While a potential term such as $m^2 \phi^2$ tends to confine the gradients 
to regions of size $1/m$, in most models of the dark energy $V''(\phi)$ must be small enough that 
these regions are of cosmological size. 

In the $w < -1$ case, it is possible, by adding higher-derivative terms to the Lagrangian, 
to avoid having finite spatial gradients lower the energy of the field. Consider, for example, 

$$\mathcal{L} = P(X, \phi) + S(X, \phi)\, (\Box \phi)^2 \tag{5.14}$$

in which case 

$$T_{00} = -\mathcal{L} + 2\left[P'(X, \phi)\, \dot\phi^2 + S'(X, \phi)\, \dot\phi^2 (\Box \phi)^2 + 2 S(X, \phi)\, \ddot\phi\, (\Box \phi) - \partial_0(\dot\phi\, S\, \Box \phi)\right] . \tag{5.15}$$

¹Here we mean energy constructed from the Hamiltonian for fluctuations about the background field 
configuration. 

²A condensate of gradients with a preferred magnitude, determined by the higher-order terms that 
stabilize Eq. (5.13), will spontaneously break the $O(3)$ rotational symmetry down to $O(2)$. The homotopy group 
$\pi_2[O(3)/O(2)]$ is non-trivial, which leads to the formation of global monopole (hedgehog) configurations. 

Setting the spatial gradients of $\phi$ to zero, we have that 

$$\dot\phi^2\, c_{\rm grad} - 2\, \partial_0(S \ddot\phi)\, \dot\phi = \frac{1 + w}{2}\, T_{00} , \tag{5.16}$$

where $c_{\rm grad}$ is the coefficient of $-(\nabla \pi)^2$ in the $\pi$ Lagrangian. If $\partial_0(S \ddot\phi)\, \dot\phi > 0$, then a model 
may have both $c_{\rm grad} > 0$ and $w < -1$. But for $w$ significantly less than $-1$, this also 
requires $\ddot\phi^2$ to be at least of order $(\dot\phi M)^2$ unless $S(X, \phi)$ is made unnaturally large. It is 
not clear how to treat these higher-derivative terms self-consistently beyond perturbation 
theory, so the dynamics of such models cannot be analyzed in a straightforward manner. 
The models we consider below have higher powers of first derivatives, but they satisfy the 
condition that $\ddot\phi^2 \ll (\dot\phi M)^2$. 

Our analysis shows that $w < -1$ scalar models typically require a wrong sign $(\nabla \pi)^2$ term 
in the effective Lagrangian. Previous analyses of ghost models ([80, 105]) have focused on 
the problems associated with negative energy, particularly with a kinetic term $\mathcal{L} = -(\partial_\mu \phi)^2$ 
that has the wrong sign for both the time and space derivatives. The classical equations 
of motion for such models do not exhibit growing modes of non-zero spatial gradients, 
although the energy of the field is unbounded from below. Models with $w < -1$ that do not 
have a wrong sign time-derivative kinetic term in the effective Lagrangian can result from a 
Lorentz-invariant action, as we demonstrate below. However, both Lorentz invariance and 
time translation invariance are spontaneously broken by a time-dependent condensate. 

In [81] a model with $\mathcal{L} = P(X)$ was proposed in which a ghost field has a time-dependent 
condensate (from now on we take the Lagrangian to be a function of $X$ only, and therefore 
invariant under the shift $\phi \to \phi + c$). We use units in which the dimensional scale $M$ of 
the model is unity ($M \sim 10^{-3}$ eV if the ghost comprises the dark energy). The flat-space 
equation of motion is 

$$\partial_\mu \left[P'(X)\, \partial^\mu \phi\right] = 0 . \tag{5.17}$$

Homogeneous solutions of the equations of motion with $\dot\phi^2 = c^2$ were considered in [81]. 
In general, the existence of a $\phi$ condensate allows for exotic equations of state, including 
$w < -1$. In what follows we let 

$$P(X) = -1 + 2(X - 1)^2 + (X - 1)^3 , \tag{5.18}$$

which leads to $w < -1$ with $T_{00} > 0$ when $X < 1$. 
The energy density is given by 

$$T_{00} = \mathcal{H} = \frac{\partial \mathcal{L}}{\partial \dot\phi}\, \dot\phi - \mathcal{L} = 2 \dot\phi^2 P'(X) - \mathcal{L} , \tag{5.19}$$

which is not necessarily minimized by a particular ghost condensate $\phi = ct$, although it is 
a solution to the flat-space equations of motion for any value of $c$. This is possible because 
there is a conserved charge associated with the shift symmetry, 

$$Q = \int d^3x\, P'(X)\, \dot\phi , \tag{5.20}$$

so configurations that do not extremize $T_{00}$ can still be stable. In fact, the Lagrangian 
describing small fluctuations has the correct sign of $\dot\pi^2$ if $P'(X) + 2XP''(X) > 0$. This 
condition is satisfied in the region $X < 1$ by (5.18) given above. There is then a local 
instability to the formation of gradients, as required by our earlier results. 
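These statements can be checked numerically for a homogeneous condensate $\phi = ct$, for which $X = \dot\phi^2 = c^2$, $p = \mathcal{L} = P(X)$ and $T_{00} = 2XP'(X) - P(X)$. The sketch below assumes the polynomial $P(X) = -1 + 2(X-1)^2 + (X-1)^3$; the exponents 2 and 3 are our reading of Eq. (5.18) and should be treated as an assumption, as should all function names.

```python
def P(X: float) -> float:
    """ASSUMED form of Eq. (5.18): P(X) = -1 + 2(X-1)^2 + (X-1)^3."""
    u = X - 1.0
    return -1.0 + 2.0 * u**2 + u**3


def dP(X: float) -> float:
    """First derivative P'(X); its sign is the sign of the gradient term."""
    u = X - 1.0
    return 4.0 * u + 3.0 * u**2


def d2P(X: float) -> float:
    """Second derivative P''(X)."""
    return 4.0 + 6.0 * (X - 1.0)


def energy_density(X: float) -> float:
    """T_00 = 2 X P'(X) - P(X) for phi = c t, with X = c^2."""
    return 2.0 * X * dP(X) - P(X)


def equation_of_state(X: float) -> float:
    """w = p / rho with p = P(X)."""
    return P(X) / energy_density(X)


def time_kinetic_coefficient(X: float) -> float:
    """Coefficient whose positivity gives the correct sign for pidot^2."""
    return dP(X) + 2.0 * X * d2P(X)


if __name__ == "__main__":
    for X in (0.95, 1.0, 1.05):
        print(X, equation_of_state(X), energy_density(X), dP(X), time_kinetic_coefficient(X))
```

At $X = 0.95$ the energy density is positive, $w < -1$, the time-derivative term has the correct sign, and $P' < 0$ signals the gradient instability. At $X = 1.05$ one finds $w > -1$ and $P' > 0$.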



5.3 Time evolution of w for ghost models of the dark energy 

Ghost models of the dark energy that approach $w = -1$ asymptotically make potentially 
interesting predictions for the time evolution of the equation of state for the dark energy. 
In a FRW universe, the equation of motion for the ghost field is 

$$\partial_\mu \left[a^3(t)\, P'(X)\, \partial^\mu \phi\right] = 0 , \tag{5.21}$$

where $a(t)$ is the FRW scale factor. If there is a value $c_*^2 = \dot\phi^2 = X$ such that $P'(c_*^2) = 0$, 
then Eq. (5.12) implies that $w = -1$ when $X = c_*^2$. The model described by Eq. (5.18) has 
$c_*^2 = 1$, and if we apply Eq. (5.21) to it, we see that if we start from $X = c_1^2$ with $c_1$ close 
to $c_*$, then we are driven asymptotically towards $X = c_*^2$ and $w = -1$. 

In the model described by Eq. (5.18), we may be driven towards $w = -1$ either from 
above or from below, depending on whether we chose to start from $c_1^2 > 1$ or from $c_1^2 < 1$. 
We have argued that $w < -1$ is problematic because of spatial gradient instabilities, so that 
the case in which we are driven to $w = -1$ from above is more interesting. 

Near the asymptotic value $c_*^2 = 1$ we have 

$$\dot\pi = \frac{\ldots}{2 P''(c_*^2)\, c_*^2} \tag{5.22}$$

Thus, in this regime, 

$$w = -1 - \frac{\ldots}{P(c_*^2)} = -1 - \frac{\ldots}{P(c_*^2)}\, (1 + z)^3 \tag{5.23}$$

Equation (5.23) offers a prediction for the $w$ parameter of the dark energy as a function of 
the redshift $z$, which could be tested by cosmological data. 
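The redshift dependence can be illustrated numerically. The sketch below assumes a homogeneous field, so that Eq. (5.21) reduces to the conservation law $a^3 P'(X) \sqrt{X} = \text{const}$, and it again assumes $P(X) = -1 + 2(X-1)^2 + (X-1)^3$. It solves for $X$ at each redshift by bisection on the branch $X \ge 1$ and evaluates $w = P/(2XP' - P)$. All names are hypothetical, and the solver is only valid while the solution stays in $1 \le X \le 2$.

```python
import math


def P(X: float) -> float:
    """ASSUMED form of Eq. (5.18)."""
    u = X - 1.0
    return -1.0 + 2.0 * u**2 + u**3


def dP(X: float) -> float:
    u = X - 1.0
    return 4.0 * u + 3.0 * u**2


def w_of_X(X: float) -> float:
    """Equation of state for a homogeneous condensate with X = phidot^2."""
    return P(X) / (2.0 * X * dP(X) - P(X))


def solve_X(a: float, K: float, lo: float = 1.0, hi: float = 2.0) -> float:
    """Bisection for a^3 P'(X) sqrt(X) = K on the branch X >= 1 (needs K >= 0)."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if a**3 * dP(mid) * math.sqrt(mid) - K > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def w_of_z(z: float, X_today: float) -> float:
    """w at redshift z, given the value of X today (scale factor a = 1 today)."""
    K = dP(X_today) * math.sqrt(X_today)
    return w_of_X(solve_X(1.0 / (1.0 + z), K))


if __name__ == "__main__":
    X0 = 1.0 + 1e-5
    for z in (0.0, 0.5, 1.0, 2.0):
        print(z, w_of_z(z, X0) + 1.0)
```

For $X$ close to $c_*^2 = 1$ the printed values of $w + 1$ scale approximately as $(1+z)^3$, which is the behaviour expected when $w + 1 \propto a^{-3}$.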

In summary, from Eqs. (5.12) and (5.13) we find that in single scalar field models 
of the dark energy with $w < -1$, the kinetic term for fluctuations about the homogeneous 
background has a wrong sign gradient term. On the other hand, there is no direct connection 
between the sign of the $\dot\pi^2$ kinetic term in the effective Hamiltonian and the sign of $w + 1$. 

5.4 Anthropic distribution for cosmological constant and primordial density perturbations 

The anthropic principle has been proposed as a possible solution to the two cosmological 
constant problems: why the cosmological constant $\Lambda$ is orders of magnitude smaller than 
any theoretical expectation, and why it is non-zero and comparable today to the energy 
density in other forms of matter ([106, 107, 108]). This anthropic argument, which predates 
direct cosmological evidence of the dark energy, is the only theoretical prediction for a small, 
non-zero $\Lambda$ ([108, 109]). It is based on the observation that the existence of life capable of 
measuring $\Lambda$ requires a universe with cosmological structures such as galaxies or clusters of 
stars. A universe with too large a cosmological constant either doesn't develop any structure, 
since perturbations that could lead to clustering have not gone non-linear before the universe 
becomes dominated by $\Lambda$, or else has a very low probability of exhibiting structure-forming 
perturbations, because such perturbations would have to be so large that they would lie 
in the far tail-end of the cosmic variance. The existence of the string theory landscape, 
in which causally disconnected regions can have different cosmological and particle physics 
properties, adds support to the notion of an anthropic rule for selecting a vacuum. 

How well does this principle explain the observed value of $\Lambda$ in our universe? Careful 
analysis by [109] finds that 5% to 12% of universes would have a cosmological constant 
smaller than our own. In everyday experience we encounter events at this level of 
confidence,³ so as an explanation this is not unreasonable. 

If the value of $\Lambda$ is not fixed a priori, then one might expect other fundamental 
parameters to vary between universes as well. This is the case if one sums over wormhole 
configurations in the path integral for quantum gravity ([110]), as well as in the string 
theory landscape ([111, 112, 113, 114]). In [114] it was emphasized that all the parameters 
of the low energy theory would vary over the space of vacua ("the landscape"). Douglas 
has initiated a program to quantify the statistical properties of these vacua, with 
additional contributions by others ([113]). 

In [115], Aguirre stressed that life might be possible in universes for which some of the 
cosmological parameters are orders of magnitude different from those of our own universe. 
The point is that large changes in one parameter can be compensated by changes in another 
in such a way that life remains possible. Anthropic predictions for a particular parameter 
value will therefore be weakened if other parameters are allowed to vary between universes. 
One cosmological parameter that may significantly affect the anthropic argument is the 
standard deviation $Q$ of the amplitude of primordial cosmological density perturbations. Rees 
in [116] and Tegmark and Rees in [117] have pointed out that if the anthropic argument is 
applied to universes where $Q$ is not fixed but randomly distributed, then our own universe 
becomes less likely because universes with both $\Lambda$ and $Q$ larger than our own are 
anthropically allowed. The purpose of the work in this section is to quantify this expectation within 
a broad class of inflationary models. Restrictions on the a priori probability distribution 
for $Q$ necessary for obtaining a successful anthropic prediction for $\Lambda$ were considered in 
[118, 119]. 

In our analysis we let both $\Lambda$ and $Q$ vary between universes and then quantify the 
anthropic likelihood of a positive cosmological constant less than or equal to that observed 
in our own universe. We offer a class of toy inflationary models that allow us to restrict the 
a priori probability distribution for $Q$, making only modest assumptions about the behavior 
of the a priori distribution for the parameter of the inflaton potential in the anthropically 
allowed range. Cosmological and particle physics parameters other than $\Lambda$ and $Q$ are held 
fixed as initial conditions at recombination. We provisionally adopt Tegmark and Rees's 
anthropic bound on $Q$: a factor of 10 above and below the value measured in our universe. 
Even though this interval is small, we find that the likelihood that our universe has a typical 
cosmological constant is drastically reduced. The likelihood tends to decrease further if 
larger intervals are considered. 

³For instance, drawing two pairs in a poker hand. 

Weinberg determined in [108] that, in order for an overdense region to go non-linear 
before the energy density of the universe becomes dominated by $\Lambda$, the value of the overdensity 
$\delta = \delta\rho/\rho$ must satisfy 

$$\delta > \left(\frac{729 \Lambda}{500 \rho}\right)^{1/3} . \tag{5.24}$$

In a matter-dominated universe this relation has no explicit time dependence. Here $\rho$ is 
the energy density in non-relativistic matter. Perturbations not satisfying the bound cease 
to grow once the universe becomes dominated by the cosmological constant. For a fixed 
amplitude of perturbations, this observation provides an upper bound on the cosmological 
constant compatible with the formation of structure. Throughout our analysis we assume 
that at recombination $\Lambda \ll \rho$. 
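The claim that the bound has no explicit time dependence in a matter-dominated universe can be verified directly. In the sketch below the linear overdensity grows as $\delta \propto a$ and the matter density falls as $\rho \propto a^{-3}$, so the ratio $\delta/\delta_{\min}$ is independent of the scale factor. The function names and the sample numbers are hypothetical.

```python
def delta_min(Lambda: float, rho: float) -> float:
    """Threshold overdensity (729 Lambda / 500 rho)^(1/3) from Eq. (5.24)."""
    return (729.0 * Lambda / (500.0 * rho)) ** (1.0 / 3.0)


def collapse_margin(delta0: float, rho0: float, Lambda: float, a: float) -> float:
    """Ratio delta / delta_min at scale factor a in the matter era.

    ASSUMPTION: linear growth delta = delta0 * a and dilution rho = rho0 / a^3,
    with delta0 and rho0 specified at a = 1.
    """
    delta = delta0 * a
    rho = rho0 / a**3
    return delta / delta_min(Lambda, rho)


if __name__ == "__main__":
    for a in (1.0, 5.0, 50.0):
        print(a, collapse_margin(1e-3, 1.0, 1e-12, a))
```

A region collapses when the printed ratio exceeds one, and since the ratio does not change with $a$, the criterion can be applied at recombination.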

To quantify whether our universe is a typical, anthropically allowed universe, additional 
assumptions about the distribution of cosmological parameters and the spectrum of density 
perturbations across the ensemble of universes are needed. 

A given slow-roll inflationary model with reheating leads to a Friedmann-Robertson-Walker 
universe with a (late-time) cosmological constant $\Lambda$ and a spectrum of perturbations 
that is approximately scale-invariant and Gaussian with a variance 

$$Q^2 = \langle \delta^2 \rangle_{\rm HC} . \tag{5.25}$$

The expectation value is computed using the ground state in the inflationary era and 
perturbations are evaluated at horizon-crossing. The variance is fixed by the parameters of 
the inflationary model together with some initial conditions. Typically, for single-field ($\phi$) 
slow-roll inflationary models, 

$$Q^2 \sim \left. \frac{H^4}{\dot\phi^2} \right|_{\rm HC} . \tag{5.26}$$

This leads to spatially separated over- or underdense regions with an amplitude $\delta$ that for 
a scale-invariant spectrum are distributed (at recombination) according to 

$$N(\sigma, \delta) = \sqrt{\frac{2}{\pi}}\, \frac{1}{\sigma}\, e^{-\delta^2 / 2\sigma^2} . \tag{5.27}$$

(The linear relation between $Q$ and the filtered $\sigma$ in Eq. (5.27) is discussed below.) 

By Bayes's theorem, the probability for an anthropically allowed universe (i.e., the 
probability that the cosmological parameters should take certain values, given that life 
has evolved to measure them) is proportional to the product of the a priori probability 
distribution $P$ for the cosmological parameters, times the probability that intelligent life 
would evolve given that choice of parameter values. Following [109], we estimate that 
second factor as being proportional to the mean fraction $\mathcal{F}(\sigma, \Lambda)$ of matter that collapses 
into galaxies. The latter is obtained in a universe with cosmological parameters $\Lambda$ and $\sigma$ 
by spatially averaging over all over- or underdense regions, so that ([109]) 

$$\mathcal{F}(\sigma, \Lambda) = \int_{\delta_{\min}}^{\infty} d\delta\, N(\sigma, \delta)\, F(\delta, \Lambda) . \tag{5.28}$$

The lower limit of integration is provided by the anthropic bound of Eq. (5.24), which gives 
$\delta_{\min} = (729 \Lambda / 500 \rho)^{1/3}$. The anthropic probability distribution is 

$$\mathcal{P}(\sigma, \Lambda) = P(\Lambda, \sigma)\, \mathcal{F}(\sigma, \Lambda)\, d\Lambda\, d\sigma . \tag{5.29}$$

Computing the mean fraction of matter collapsed into structures requires a model for 
the growth and collapse of inhomogeneities. The Gunn-Gott model ([120, 121]) describes 
the growth and collapse of an overdense spherical region surrounded by a compensating 
underdense shell. The weighting function $F(\delta, \Lambda)$ gives the fraction of mass in the 
inhomogeneous region of density contrast $\delta$ that eventually collapses (and then forms galaxies). To 
a good approximation it is given by ([109]) 

Additional model dependence occurs in the introduction of the parameter $s$ given by the 
ratio of the volume of the overdense sphere to the volume of the underdense shell surrounding 
the sphere. We will set $s = 1$ throughout. 

Since the anthropically allowed values for $\Lambda$ are so much smaller than any other mass 
scale in particle physics, and since we assume that $\Lambda = 0$ is not a special point in the 
landscape, we follow [122, 109] in using the approximation $P(\Lambda) \simeq P(\Lambda = 0)$ for $\Lambda$ within 
the anthropically allowed window.⁴ The requirement that the universe not recollapse before 
intelligent life has had time to evolve anthropically rules out large negative $\Lambda$ ([107, 124]). 
We will assume that the anthropic cutoff for negative $\Lambda$ is close enough to $\Lambda = 0$ that all 
$\Lambda < 0$ may be ignored in our calculations. 

As an example of a concrete model for the variation in $Q$ between universes, we consider 
inflaton potentials of the form (see, for example, [125]) 

$$V = \Lambda + \lambda \phi^{2p} , \tag{5.31}$$

where $p$ is a positive integer.⁵ We assume there are additional couplings that provide 
an efficient reheating mechanism, but are unimportant for the evolution of $\phi$ during the 
inflationary epoch. The standard deviation of the amplitude of perturbations gives 

(5.32) 

where $A$ is a constant, and $\phi_{\rm HC}$ is the value of the field when the mode of wave number 
$k$ leaves the horizon. This $\phi_{\rm HC}$ has logarithmic dependence on $\lambda$ and $k$, which we neglect. 
Randomness in the initial value for $\phi$ affects only those modes that are (exponentially) well 
outside our horizon. Throughout this section, we will set the spectral index to 1 and ignore 
its running. Equation (5.32) then gives $\lambda \propto Q^2$. 

Next, suppose that the fundamental parameters of the Lagrangian are not fixed, but 
vary between universes, as might be expected if one sums over wormhole configurations in 
the path integral for quantum gravity ([110]) or in the string theory landscape ([111, 
114, 113]). To obtain the correct normalization for the density perturbations observed in 
our universe, the self-coupling must be extremely small. As the standard deviation $Q$ will 
be allowed to vary by an order of magnitude around $10^{-5}$, for this model the self-coupling 
in alternate universes will be very small as well. 

⁴Garriga and Vilenkin point to examples of quintessence models in which the approximation 
$P(\Lambda) \simeq P(\Lambda = 0)$ in the anthropically allowed range is not valid [123]. 

⁵Recent analysis of astronomical data disfavors the $\lambda \phi^4$ inflationary model ([126]), but for generality we 
will consider an arbitrary $p$ in Eq. (5.31). 

We may then perform an expansion about $\lambda = 0$ for the a priori probability distribution 
of $\lambda$. The smallness of $\lambda$ suggests that we may keep only the leading term in that 
expansion. If the a priori probability distribution extends to negative values of $\lambda$ (which 
are anthropically excluded due to the instability of the resulting action for $\phi$), we expect it 
to be smooth near $\lambda = 0$, and the leading term in the power series expansion to be zeroth 
order in $\lambda$ (i.e., a constant). Therefore we expect a flat a priori probability distribution for 
$\lambda$. The a priori probability distribution for $Q$ is then 

$$P(Q) \propto \frac{d\lambda}{dQ} \propto Q , \tag{5.33}$$

where the normalization constant is determined by the range of integration in $Q$. Note that 
this distribution favors large $Q$. On the other hand, if the a priori probability distribution 
for the coupling $\lambda$ only has support for $\lambda > 0$, then $\lambda = 0$ is a special point and we cannot 
argue that $P(Q) \propto Q$. However, since the anthropically allowed values of $\lambda$ are very small, 
the a priori distribution for $\lambda$ should be dominated, in the anthropically allowed window, 
by a leading term such as $P(\lambda) \sim \lambda^q$. Normalizability requires $q > -1$. Using $\lambda \propto Q^2$, this 
gives $P(Q) \sim Q^{2q+1}$. 
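The change of variables from the coupling to $Q$ can be checked with a short Monte Carlo experiment. The sketch below draws $\lambda$ on $(0, 1)$ from a density proportional to $\lambda^q$, sets $Q = \sqrt{\lambda}$ (the proportionality constant in $\lambda \propto Q^2$ is set to one for convenience, which is an assumption), and compares the sample mean of $Q$ with the mean of a density proportional to $Q^{2q+1}$ on $(0, 1)$, namely $(2q+2)/(2q+3)$. All names are hypothetical.

```python
import random


def sample_Q(q: float, n: int, seed: int = 0) -> list:
    """Draw n values of Q = sqrt(lambda), with lambda on (0, 1) distributed as lambda^q."""
    rng = random.Random(seed)
    # Inverse-CDF sampling: the CDF of lambda is lambda^(q+1), so lambda = U^(1/(q+1)).
    return [(rng.random() ** (1.0 / (q + 1.0))) ** 0.5 for _ in range(n)]


def expected_mean_Q(q: float) -> float:
    """Mean of a density proportional to Q^(2q+1) on (0, 1)."""
    return (2.0 * q + 2.0) / (2.0 * q + 3.0)


if __name__ == "__main__":
    for q in (0.0, 1.0, -0.5):
        qs = sample_Q(q, 100_000)
        print(q, sum(qs) / len(qs), expected_mean_Q(q))
```

The flat prior corresponds to $q = 0$, for which the sample mean approaches $2/3$, as expected for $P(Q) \propto Q$.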

Before proceeding, it is convenient to transform to the new variables: 

$$y \equiv \frac{\Lambda}{\rho_*} ; \qquad \hat\sigma \equiv \sigma \left(\frac{\rho_*}{\rho}\right)^{1/3} . \tag{5.34}$$

Here $\rho$ is the energy density in non-relativistic matter at recombination, which we take 
to be fixed in all universes, and $\rho_*$ is the value for the present-day energy density of 
non-relativistic matter in our own universe. For a matter-dominated universe $\hat\sigma$ is 
time-independent, whereas $y$ is constant for any era. Here and throughout this section, a subscript 
$*$ denotes the value that is observed in our universe for the corresponding quantity. The 
only quantities whose variation from universe to universe we will consider are $y$ and $\hat\sigma$. 

In terms of these variables and following [109], the probability distribution of Eq. (5.29) 
is found to be 

$$\mathcal{P} = N\, d\hat\sigma\, dy\, P(\hat\sigma) \int_\beta^\infty dx\, \frac{e^{-x}}{\sqrt{\beta} + \sqrt{x}} , \tag{5.35}$$

where 

and $N$ is the normalization constant. 

Notice that, since $x > \beta$, large $\beta$ implies that $\mathcal{P} \sim e^{-\beta} \ll 1$. For a fixed $\hat\sigma$, large $y$ 
implies large $\beta$. Thus, for fixed $\hat\sigma$, large cosmological constants are anthropically disfavored. 
But if $\hat\sigma$ is allowed to increase, then $\beta \sim O(1)$ may be maintained at larger $y$. Garriga and 
Vilenkin have pointed out that the distribution in Eq. (5.35) may be rewritten using the 
change of variables $(\hat\sigma, y) \to (\hat\sigma, \beta)$ ([119]). The Jacobian for that transformation is a 
function only of $\hat\sigma$. Equation (5.35) then factorizes into two parts: one depending only on 
$\hat\sigma$, the other only on $\beta$. Integration over $\hat\sigma$ produces an overall multiplicative factor that 
cancels out after normalization, so that any choice of $P(\hat\sigma)$ will give the same distribution 
for the dimensionless parameter $\beta$. In that sense, even in a scenario where $\hat\sigma$ is randomly 
distributed, the computation in [109] may be seen as an anthropic prediction for $\beta$. The 
measured value of $\beta$ is, indeed, typical of anthropically allowed universes, but an anthropic 
explanation for $\beta$ alone does not address the problem of why both $\Lambda$ and $Q$ should be so 
small in our universe. 

Implementing the anthropic principle requires making an assumption about the minimum
mass of "stuff" collapsed into stars, galaxies, or clusters of galaxies that is needed for
the formation of life. It is more convenient to express the minimum mass $M_{\min}$ in terms of
a comoving scale $R$: $M_{\min} = 4\pi \bar\rho a^3 R^3 / 3$ (by convention $a = 1$ today, so $R$ is a physical
scale). We do not know the precise value of $R$. A better understanding of biology would
in principle determine its value, which should only depend on chemistry, the fraction of
matter in the form of baryons, and Newton's constant. In our analysis these are all fixed
initial conditions at recombination. In particular, we would not expect $M_{\min}$ to depend on
$\Lambda$ or $Q$.^7 Therefore, even though the relation between $M_{\min}$ and $R$ depends on present-day
cosmological parameters, the value of this threshold will be constant between universes
because it depends only on parameters that we are treating as fixed initial conditions. Thus,
in computing the probability distribution over universes, we will fix $R$. Since we do not know
the correct anthropic value for $R$, we will present our results for both $R = 1$ and 2
Mpc. ($R$ on the order of a few Mpc corresponds to requiring that structures as large as our
galaxy be necessary for life.)

^6 We thank Garriga and Vilenkin for explaining this point to us.

^7 Note, however, that requiring life to last for billions of years (long enough for it to develop intelligence
and the ability to do astronomy) might place bounds on $Q$. See [117].

We then proceed to filter out perturbations with wavelength smaller than $R$, leading to
a variance $\sigma^2$ that depends on the filtering scale. Expressed in terms of the power spectrum
evaluated at recombination,

$$\sigma^2 = \frac{1}{2\pi^2} \int_0^\infty dk \, k^2 P(k) W^2(kR) \,,$$ (5.37)

where $W$ is the filter function, which we take to be a Gaussian, $W(x) = e^{-x^2/2}$. $P(k)$ is
the power spectrum, which we assume to be scale-invariant. (For $P(k)$ we use Eq. (39) of
[109], setting $n = 1$.)
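
To make the filtering step concrete, the following sketch evaluates the filtered variance numerically for a Gaussian window. It is only an illustration under stated assumptions: the power spectrum is taken to be a bare power law $P(k) = A k^n$ with no transfer function (the calculation in the text uses the full expression from [109]), the prefactor $1/2\pi^2$ is the common normalization convention, and all function names are hypothetical.

```python
import math


def gaussian_window(x):
    """Gaussian filter W(x) = exp(-x^2 / 2)."""
    return math.exp(-0.5 * x * x)


def power_spectrum(k, amplitude=1.0, n=1.0):
    """Bare power law A * k^n. Assumption: no transfer function is applied."""
    return amplitude * k ** n


def filtered_variance(R, amplitude=1.0, n=1.0, k_max_times_R=12.0, steps=20000):
    """sigma^2 = (1 / 2 pi^2) * integral_0^inf dk k^2 P(k) W^2(kR).

    Composite Simpson rule on [0, k_max]; the Gaussian makes the
    truncated tail negligible. `steps` must be even.
    """
    k_max = k_max_times_R / R
    h = k_max / steps
    total = 0.0
    for i in range(steps + 1):
        k = i * h
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * k ** 2 * power_spectrum(k, amplitude, n) * gaussian_window(k * R) ** 2
    return total * h / 3.0 / (2.0 * math.pi ** 2)


def analytic_variance(R, amplitude=1.0):
    """Closed form for n = 1: sigma^2 = A / (4 pi^2 R^4)."""
    return amplitude / (4.0 * math.pi ** 2 * R ** 4)
```

For this toy spectrum the variance scales as $R^{-4}$. The transfer function used in the actual calculation changes that scaling, so the sketch should not be used to reproduce the quoted numbers.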

Evaluating Eq. (5.37) at recombination gives, for our universe,

$$\sigma_* = C_* Q_* \,.$$ (5.38)

The number $C_*$ contains the growth factor and transfer function evaluated from horizon
crossing to recombination and only depends on physics from that era. We assume $\Lambda$ is
small enough so that at recombination it can be ignored and thus we take the variation in
$\sigma$ between universes to come solely from its explicit dependence on $Q$.

We may then use observations of $Q_*$ and $\sigma_*$ to determine $\sigma = C_* Q$, valid for all universes.
We use the explicit expression for $C_*$ that is obtained from Eqs. (39)-(43) and (48)-(51)
in [109]. This takes as inputs the Hubble parameter $H_0 = 100 h_*$ km/s/Mpc, the energy density
in non-relativistic matter $\Omega_*$, the cosmological constant $\Omega_\Lambda = 1 - \Omega_*$, the baryon fraction
$\Omega_b = 0.023 h_*^{-2}$, the smoothing scale $R$, and the COBE-normalized amplitude of fluctuations
at horizon crossing, $Q_* = 1.94 \times 10^{-5} \, \Omega_*^{-0.785 - 0.05 \ln \Omega_*}$.

As we have argued, the dependence of $C_*$ on the cosmological constant is not relevant
for our purposes. For our calculations we use $\Omega_* = 0.134 h_*^{-2}$ and $h_* = 0.73$, consistent
with their observed best-fit values ([127]). The smoothing scale $R$ will be taken to be either
1 Mpc or 2 Mpc, and the corresponding values for $C_*$ are 5.2 · 10^? and 3.8 · 10^?.
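
As a quick arithmetic check of the inputs just listed, the sketch below evaluates the density parameters and the COBE-normalized amplitude from the fitting formula $Q_* = 1.94 \times 10^{-5} \, \Omega_*^{-0.785 - 0.05 \ln \Omega_*}$ for a flat universe. The constant and function names are hypothetical, and the block does not attempt to compute $C_*$, which requires the expressions in [109].

```python
import math

H_STAR = 0.73                          # dimensionless Hubble parameter h_*
OMEGA_STAR = 0.134 / H_STAR ** 2       # non-relativistic matter, Omega_*
OMEGA_LAMBDA = 1.0 - OMEGA_STAR        # cosmological constant (flat universe)
OMEGA_BARYON = 0.023 / H_STAR ** 2     # baryon fraction


def cobe_amplitude(omega_matter):
    """COBE-normalized amplitude at horizon crossing for a flat universe."""
    exponent = -0.785 - 0.05 * math.log(omega_matter)
    return 1.94e-5 * omega_matter ** exponent


Q_STAR = cobe_amplitude(OMEGA_STAR)
```

With these inputs $\Omega_*$ is about 0.25 and $Q_*$ comes out at a few times $10^{-5}$.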

The values chosen for the range of $Q$ are motivated by the discussion in [117] about
anthropic limits on the amplitude of the primordial density perturbations. The authors of
[117] argue that $Q$ between 10^? and 10^? leads to the formation of numerous supermassive
black holes, which might obstruct the emergence of life.^8 They then claim that universes
with $Q$ less than 10^? are less likely to form stars, or if star clusters do form, that they would
not be bound strongly enough to retain supernova remnants. Since there is considerable
uncertainty in these limits, we carry out calculations using both the range indicated by
[117] as well as a range that is somewhat broader.^9

^8 They also note that for $Q >$ 10^-? formation of life is possible, but planetary disruptions caused by flybys
may make it unlikely for planetary life to last billions of years.

                            $P(Q) \propto 1/Q^{0.9}$ in the range       $P(Q) \propto Q$ in the range
                            $Q_*/10 < Q < 10Q_*$  $Q_*/15 < Q < 15Q_*$  $Q_*/10 < Q < 10Q_*$  $Q_*/15 < Q < 15Q_*$
$R$ (Mpc)                   1          2          1          2          1          2          1          2
$P(y < y_*)$                1·10^-?    3·10^-?    4·10^-?    1·10^-?    5·10^-?    1·10^-?    1·10^-?    4·10^-?
$\langle y \rangle / y_*$   1·10^?     4·10^?     4·10^?     1·10^?     1·10^?     5·10^?     4·10^?     2·10^?
$y_{5\%}/y_*$               9·10       4·10       3·10^?     1·10^?     2·10^?     7·10       6·10^?     2·10^?

Table 5.1: Anthropically determined properties of the cosmological constant.

Previous work on applying the anthropic principle to variable $\Lambda$ and $Q$ has assumed a
priori distributions $P(Q)$ that fall off as $1/Q^k$ for large $Q$, with $k \geq 3$ [118, 119]. Such
distributions were chosen in order to keep the anthropic probability $\mathcal{P}(y, Q)$ normalizable,
and they usually yield anthropic predictions for the cosmological constant similar to those
that were obtained in [109] by fixing $Q$ to its observed value, because they naturally favor
a $Q$ as small as its observed value in our universe. (For instance, for $P(Q) \propto 1/Q^3$ in the
range $Q_*/10 < Q < 10 Q_*$, $P(y < y_*) = 5\%$ for $R = 1$ Mpc, while $P(y < y_*) = 7\%$ for
$R = 2$ Mpc.)

However, if we accept the argument of Tegmark and Rees in [117] that there are natural
anthropic cutoffs on $Q$, it follows that the behavior of $P(Q)$ at large $Q$ is irrelevant to the
normalizability of $\mathcal{P}(y, Q)$. Furthermore, $P(Q) \sim 1/Q^k$ in the neighborhood of $Q = 0$ for
$k \geq 1$ leads to an unnormalizable distribution, since the integral $\int P(Q) \, dQ$ blows up. In
what follows we shall consider two a priori distributions: $P(Q) \propto Q$, and $P(Q) \propto 1/Q^{0.9}$
inside the anthropic window, motivated by the inflationary models we have discussed.
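
To illustrate how strongly a power-law prior can favor large $Q$ before any anthropic weighting is applied, the sketch below computes the fraction of a normalized prior $P(Q) \propto Q^{s}$ on a finite window that lies below $Q_*$. This is only the a priori weight, not the anthropically weighted probability reported in the tables, and the function name and its parameters are hypothetical. $Q$ is measured in units of $Q_*$.

```python
import math


def prior_fraction_below(exponent, q_low, q_high, q_split=1.0):
    """Fraction of the prior P(Q) ~ Q**exponent on [q_low, q_high] with Q < q_split.

    Q is in units of Q_*. The prior is normalized on the window, so only
    ratios of integrals of the power law are needed.
    """
    def integral(a, b):
        if exponent == -1.0:
            return math.log(b / a)          # logarithmic case
        p = exponent + 1.0
        return (b ** p - a ** p) / p

    return integral(q_low, q_split) / integral(q_low, q_high)


# A priori weight below Q_* on the window Q_*/10 < Q < 10 Q_*
linear_prior = prior_fraction_below(1.0, 0.1, 10.0)      # P(Q) ~ Q
shallow_prior = prior_fraction_below(-0.9, 0.1, 10.0)    # P(Q) ~ 1/Q^0.9
```

For the linear prior only about one percent of the a priori weight lies below $Q_*$, whereas for the $1/Q^{0.9}$ prior the weight is split almost evenly on a logarithmic scale.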

The results are summarized in Table 5.1, where $P(y < y_*)$ is the anthropic probability
that the value $y$ be no greater than what is observed in our own universe, $\langle y \rangle$ is the
anthropically weighted mean value of $y$, and $y_{5\%}$ is the value of $y$ such that the anthropic
probability of obtaining a value no greater than that is 5%.

^9 Notice that we are using the ranges indicated in [117] as absolute anthropic cutoffs. Arguments like
those made in [117] introduce some correction to the approximation made in [109] that the probability of
life is proportional to the amount of matter that collapses into compact structures. Since we are largely
ignorant of what the form of this correction is, we have approximated it as a simple window function.

By comparison, for this choice of cosmological parameters, the authors of [109] find that,
for $Q$ fixed (or measured), the probability of a universe having a cosmological constant no
greater than our own is much higher: $P(y < 0.7/0.3) = 0.05$ and 0.1, for $R = 1$ Mpc and
$R = 2$ Mpc, respectively.^10

                            $P(Q) \propto 1/Q^{0.9}$ in the range       $P(Q) \propto Q$ in the range
                            $Q_*/10 < Q < 10Q_*$  $Q_*/15 < Q < 15Q_*$  $Q_*/10 < Q < 10Q_*$  $Q_*/15 < Q < 15Q_*$
$R$ (Mpc)                   1          2          1          2          1          2          1          2
$P(Q < Q_*)$                8·10^-?    8·10^-?    2·10^-?    2·10^-?    1·10^-?    1·10^-?    1·10^-?    1·10^-?
$\langle Q \rangle / Q_*$   8          8          11         11         8          8          13         13
$Q_{5\%}/Q_*$               4          4          6          6          5          5          8          8

Table 5.2: Anthropically determined properties of the amplitude for density perturbations.

One can also ask what is the probability of observing a value for $Q$ in the range $Q_*/10 <
Q < Q_*$, after averaging over all possible cosmological constants. Table 5.2 summarizes the
resulting distribution in $Q$.

In summary, inflation and a landscape of anthropically determined coupling constants
provide (in some scenarios) a conceptually clean framework for variation between universes
in the magnitude of Q. Since increasing Q allows the probability of structure to remain 
non-negligible for A considerably larger than in our own universe, anthropic solutions to 
the cosmological constant problem are weakened by allowing Q as well as A to vary from 
one universe to another. 



^10 These numbers are taken from Table 1 in the published version of [109].

Chapter 6

The reverse sprinkler 

Everything's got a moral, if only you can find it. 

-- Lewis Carroll, Alice's Adventures in Wonderland

This chapter is based largely on [128]. Some followups that have appeared since the
publication of that article include [129] and [130].

6.1 Introduction 

In 1985, R. P. Feynman, one of the most distinguished theoretical physicists of his time,
published a collection of autobiographical anecdotes that attracted much attention on account
of their humor and outrageousness ([131]). While describing his time at Princeton as a
graduate student (1939-1942), Feynman tells the following story ([132]):

There was a problem in a hydrodynamics book,^1 that was being discussed by all
the physics students. The problem is this: You have an S-shaped lawn sprinkler 
. . . and the water squirts out at right angles to the axis and makes it spin in a 
certain direction. Everybody knows which way it goes around; it backs away 
from the outgoing water. Now the question is this: If you . . . put the sprinkler 
completely under water, and sucked the water in . . . which way would it turn? 

^1 It has not been possible to identify the book to which Feynman was referring. As we shall discuss,
the matter is treated in Ernst Mach's Mechanik, first published in 1883 ([137]). Yet this book is not a
"hydrodynamics book" and the reverse sprinkler is presented as an example, not a problem. In [147],
John Wheeler suggests that the problem occurred to them while discussing a different question in the
undergraduate mechanics course that Wheeler was teaching and for which Feynman was the grader.

Feynman went on to say that many Princeton physicists, when presented with the
problem, judged the solution to be obvious, only to find that others arrived with equal 
confidence at the opposite answer, or that they had changed their minds by the following 
day. Feynman claims that after a while he finally decided what the answer should be 
and proceeded to test it experimentally by using a very large water bottle, a piece of 
copper tubing, a rubber hose, a cork, and the air pressure supply from the Princeton 
cyclotron laboratory. Instead of attaching a vacuum to suck the water, he applied high air 
pressure inside of the water bottle to push the water out through the sprinkler. According to 
Feynman's account, the experiment initially went well, but after he cranked up the setting 
for the pressure supply, the bottle exploded, and "... the whole thing just blew glass and
water in all directions throughout the laboratory ..." ([133]).

Feynman ([131]) did not inform the reader what his answer to the reverse sprinkler
problem was or what the experiment revealed before exploding. Over the years, and
particularly after Feynman's autobiographical recollections appeared in print, many people have
offered their analyses, both theoretical and experimental, of this reverse sprinkler problem.^2
The solutions presented often have been contradictory and the theoretical treatments, even 
when they have been correct, have introduced unnecessary conceptual complications that 
have obscured the basic physics involved. 

All physicists will probably know the frustration of being confronted by an elementary 
question to which they cannot give a ready answer in spite of all the time dedicated to the 
study of the subject, often at a much higher level of sophistication than what the problem 
at hand would seem to require. Our intention is to offer an elementary treatment of this 
problem, which should be accessible to a bright secondary school student who has learned 
basic mechanics and fluid dynamics. We believe that our answer is about as simple as it can 
be made, and we discuss it in light of published theoretical and experimental treatments. 



^2 In the literature it is more usual to see this problem identified as the "Feynman inverse sprinkler."
Because the problem did not originate with Feynman and Feynman never published an answer to the
problem, we have preferred not to attach his name to the sprinkler. Furthermore, even though it is a
pedantic point, a query of the Oxford English Dictionary suggests that "reverse" (opposite or contrary in
character, order, or succession) is a more appropriate description than "inverse" (turned upside down) for
a sprinkler that sucks water.

Figure 6.1: A sprinkler submerged in a tank of water as seen from above. The L-shaped sprinkler
is closed, and the forces and torques exerted by the water pressure balance each other.

6.2 Pressure difference and momentum transfer 

Feynman speaks in his memoirs of "an S-shaped lawn sprinkler." It should not be difficult, 
however, to convince yourself that the problem does not depend on the exact shape of the 
sprinkler, and for simplicity we shall refer in our argument to an L-shaped structure. In 
Fig. 6.1 the sprinkler is closed: water cannot flow into it or out of it. Because the water
pressure is equal on opposite sides of the sprinkler, it will not turn: there is no net torque 
around the sprinkler pivot. 

Let us imagine that we then remove part of the wall on the right, as pictured in Fig. 6.2,
opening the sprinkler to the flow of water. If water is flowing in, then the pressure marked
$P_2$ must be lower than the pressure $P_1$, because water flows from higher to lower pressure.
In both Fig. 6.1 and Fig. 6.2, the pressure $P_1$ acts on the left. But because a piece of the
sprinkler wall is missing in Fig. 6.2, the relevant pressure on the upper right part of the
open sprinkler will be $P_2$. It would seem then that the reverse sprinkler should turn toward
the water, because if $P_2$ is less than $P_1$, there would be a net force to the right in the upper
part of the sprinkler, and the resulting torque would make the sprinkler turn clockwise. If
$A$ is the cross section of the sprinkler intake pipe, this torque-inducing force is $A(P_1 - P_2)$.

Figure 6.2: The sprinkler is now open. If water is flowing into it, then the pressures marked $P_1$
and $P_2$ must satisfy $P_1 > P_2$.

But we have not taken into account that even though the water hitting the inside wall
of the sprinkler in Fig. 6.2 has lower pressure, it also has left-pointing momentum. The
incoming water transfers that momentum to the sprinkler as it hits the inner wall. This
momentum transfer would tend to make the sprinkler turn counterclockwise. One of the
reasons why the reverse sprinkler is a confusing problem is that there are two effects in play
that, acting on their own, would make the sprinkler turn in opposite directions. The
problem is to figure out the net result of these two effects.

How much momentum is being transferred by the incoming water to the inner sprinkler
wall in Fig. 6.2? If water is moving across a pressure gradient, then over a differential time
$dt$, a given "chunk" of water will pass from an area of pressure $P$ to an area of pressure
$P - dP$, as illustrated in Fig. 6.3. If the water travels down a pipe of cross-section $A$,
its momentum gain per unit time is $A \, dP$. Therefore, over the entire length of the pipe,
the water picks up momentum at a rate $A(P_1 - P_2)$, where $P_1$ and $P_2$ are the values of
the pressure at the endpoints of the pipe. (In the language of calculus, $A(P_1 - P_2)$ is the
total force that the pressure gradient across the pipe exerts on the water. We obtain it by
integrating over the differential force $A \, dP$.)^3

^3 As some readers of [128] pointed out to us ([134, 135]), this simplified discussion ignores the fact that
the cross-section of a fluid flow is not in general constant when a pressure gradient exists. For example, for
an ideal, incompressible fluid the velocity (and therefore, through Bernoulli's equation, also the pressure)
must be constant inside a pipe of fixed cross-section $A$. In that case all of the acceleration of the fluid would
have to occur outside of the sprinkler tube, as the flow narrows down to a cross-section $A$. However, if $P_1$
is the pressure of the fluid at rest, then $A(P_1 - P_2)$ is still the correct expression for the rate at which the
flow is gaining momentum. In fact, the shape of the flow into the reverse sprinkler will not be relevant to
our discussion at all, as should become more clear from the discussion of angular momentum
conservation in Section 6.3.

Figure 6.3: As water flows down a tube with a pressure gradient, it picks up momentum.

For steady flow, the rate $A(P_1 - P_2)$ must equal the rate at which the water is
transferring momentum to the sprinkler wall in Fig. 6.2, because otherwise the total amount
of momentum contained in the flow of water would not be constant. Therefore $A(P_1 - P_2)$
is the force that the incoming water exerts on the inner sprinkler wall in Fig. 6.2 by virtue
of the momentum it has gained in traveling down the intake pipe.

Because the pressure difference and the momentum transfer effects cancel each other,
it would seem that the reverse sprinkler would not move at all. Notice, however, that we
considered the reverse sprinkler only after water was already flowing continuously into it.
In fact, the sprinkler will turn toward the water initially, because the forces will balance
only after water has begun to hit the inner wall of the sprinkler, and by then the sprinkler
will have begun to turn toward the incoming water. That is, initially only the pressure
difference effect and not the momentum transfer effect is relevant. (As the water flow stops,
there will be a brief period during which only the momentum transfer and not the pressure
difference will be acting on the sprinkler, thus producing a momentary torque opposite to
the one that acted when the water flow was being established.)

Why can't we similarly "prove" the patently false statement that a non-sucking sprinkler
submerged in water will not turn as water flows steadily out of it? In that case the water
is going out and hitting the upper inner wall, not the left inner wall. It exerts a force, but
that force produces no torque around the pivot. The pressure difference, on the other hand,
does exert a torque. The pressure in this case has to be higher inside the sprinkler than
outside it, so the sprinkler turns counterclockwise, as we expect from experience.

Figure 6.4: The force that pushes the water must originally come from a solid wall. The force that
causes the water flow is shown for both the regular and the reverse sprinklers when submerged in a
tank of water.
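
The steady-state argument of this section for the reverse sprinkler can be condensed into three lines. This is only a bookkeeping summary for an ideal fluid; the labels $F_{\text{pressure}}$, $F_{\text{momentum}}$, and $F_{\text{net}}$ are introduced here for convenience and do not appear elsewhere in the text.

```latex
\begin{align}
F_{\text{pressure}} &= A\,(P_1 - P_2)
  && \text{(unbalanced pressure on the open sprinkler; clockwise torque)} \\
F_{\text{momentum}} &= \int_{P_2}^{P_1} A\,dP = A\,(P_1 - P_2)
  && \text{(momentum delivered per unit time to the inner wall; counterclockwise torque)} \\
F_{\text{net}} &= F_{\text{pressure}} - F_{\text{momentum}} = 0
  && \text{(steady flow)}
\end{align}
```

While the flow is being established only the first line contributes, and while it is being shut off only the second does, which accounts for the two opposite transient torques described above.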

6.3 Conservation of angular momentum 

We have argued that, if we ignore the transient effects from the switching on and switching 
off of the fluid flow, we do not expect the reverse sprinkler to turn at all. A pertinent 
question is why, for the case of the regular sprinkler, the sprinkler-water system clearly
exhibits no net angular momentum around the pivot (with the angular momentum of the 
outgoing water cancelling the angular momentum of the rotating sprinkler), while for the 
reverse sprinkler the system would appear to have a net angular momentum given by the 
incoming water. The answer lies in the simple observation that if the water in a tank is 
flowing, then something must be pushing it. In the regular sprinkler, there is a high-pressure 
zone near the sprinkler wall next to the pivot, so it is this lower inner wall that is doing the 
original pushing, as shown in Fig. 6.4(a).

For the reverse sprinkler, the highest pressure is outside the sprinkler, so the pushing
originally comes from the right wall of the tank in which the whole system sits, as shown
in Fig. 6.4(b). The force on the regular sprinkler clearly causes no torque around the pivot,
while the force on the reverse sprinkler does. That the water should acquire a net angular
momentum around the sprinkler pivot in the absence of an external torque might seem a
violation of Newton's laws, but only because we are neglecting the movement of the tank
itself. Consider a water tank with a hole in its side, such as the one pictured in Fig. 6.5. The
water acquires a net angular momentum with respect to any point on the tank's bottom,
but this angular momentum violates no physical laws because the tank is not inertial: It
recoils as water flows out of it.

Figure 6.5: A tank with an opening on its side will exhibit a flow such that the water will have an
angular momentum with respect to the tank's bottom, even though there is no external source of
torque corresponding to the angular momentum. The apparent paradox is resolved by noting that
the tank bottom offers no inertial point of reference, because the tank is recoiling due to the motion
of the water.

But there is one further complication: In the reverse sprinkler shown in Fig. 6.4, the
water that has acquired left-pointing momentum from the pushing of the tank wall will 
transfer that momentum back to the tank when it hits the inner sprinkler wall, so that once 
water is flowing steadily into the reverse sprinkler, the tank will stop experiencing a recoil 
force. The situation is analogous to that of a ship inside of which a machine gun is fired, as 
shown in Fig. 6.6. As the bullet is fired, the ship recoils, but when the bullet hits the ship
wall and becomes embedded in it, the bullet's momentum is transferred to the ship. (We 
assume that the collision of the bullets with the wall is completely inelastic.) 

If the firing rate is very low, the ship periodically acquires a velocity in a direction
opposite to that of the fired bullet, only to stop when that bullet hits the wall. Thus the
ship moves by small steps in a direction opposite that of the bullets' flight. As the firing
rate is increased, eventually one reaches a rate such that the interval between successive
bullets being fired is equal to the time it takes for a bullet to travel the length of the ship. If
the machine gun is set for this exact rate from the beginning, then the ship will move back
with a constant velocity from the moment that the first bullet is fired (when the ship picks
up momentum from the recoil) to the moment the last bullet hits the wall (when the ship
comes to a stop). In between those two events the ship's velocity will not change because
every firing is simultaneous with the previous bullet hitting the ship wall.

Figure 6.6: In this thought experiment, a ship floats in the ocean while a machine gun with variable
firing rate is placed at one end. Bullets fired from the gun will travel the length of the ship and hit
the wall on the other side, where they stop.

As the firing rate is made still higher, the ship will again move in steps, because at the
time that a bullet is being fired, the previous bullet will not have quite made it to the ship
wall. Eventually, when the rate of firing is twice the inverse of the time it takes for a bullet
to travel the length of the ship, the motion of the ship will be such that it picks up speed
upon the first two shots, then moves uniformly until the penultimate bullet hits the wall,
whereupon the ship loses half its velocity. The ship will finally come to a stop when the last
bullet has hit the wall. At this point it should be clear how the ship's motion will change
as we continue to increase the firing rate of the gun.^4
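
The stepwise motion of the ship can be checked with a few lines of code. The sketch below is a toy model under simplifying assumptions that are ours, not part of the thought experiment as stated: the ship is much heavier than a bullet, so every shot changes its velocity by the same amount (one "kick"), every impact restores it by the same amount, and the bullet flight time is the same for every shot. The function name and its parameters are hypothetical.

```python
from fractions import Fraction


def ship_velocity(t, n_bullets, firing_interval, flight_time, kick=1):
    """Ship velocity at time t, in units of the recoil from a single shot.

    Bullet i is fired at i * firing_interval and stops in the far wall
    flight_time later. Each bullet in flight contributes one unit of
    recoil velocity to the ship (negative = opposite to the bullets).
    """
    in_flight = 0
    for i in range(n_bullets):
        fired_at = i * firing_interval
        hits_at = fired_at + flight_time
        if fired_at <= t < hits_at:
            in_flight += 1
    return -kick * in_flight


T = Fraction(1)  # flight time of one bullet (sets the unit of time)

# Low rate: the ship moves in steps, stopping between shots.
low = [ship_velocity(Fraction(n, 2), 3, 2 * T, T) for n in (1, 3, 5)]

# Matched rate: constant velocity from the first shot to the last impact.
matched = [ship_velocity(Fraction(n, 2), 5, T, T) for n in (0, 1, 2, 9, 10)]

# Twice the matched rate: two units of velocity, then one, then rest.
double = [ship_velocity(Fraction(n, 4), 5, T / 2, T) for n in (1, 3, 5, 9, 11, 12)]
```

Exact rational times are used so that a shot and an impact that occur at the same instant are counted consistently. In this model the ship's speed is proportional to the number of bullets in flight, which is the discrete analogue of the momentum carried by the water between the tank wall and the sprinkler.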

^4 Two interesting problems for an introductory university-level physics course suggest themselves. One
is to show that the center of mass of the bullets-and-ship system will not move in the horizontal direction
regardless of the firing rate, as one expects from momentum conservation. Another would be to analyze this
problem in the light of Einstein's relativity of simultaneity.

For the case of continuous flow of water in a tank (rather than a discrete flow of machine
gun bullets in a ship), there clearly will be no intermediate steps, regardless of the rate of
flow. Figure 6.7 shows a water tank connected to a shower head. Water flows (with a
consequent linear and angular momentum) between the points marked A and B, before
exiting via the shower head. When the faucet valve is opened, the tank will experience a
recoil from the outgoing water, until the water reaches B and begins exiting through the
shower head, at which point the forces on the tank will balance. By then the tank will have
acquired a left-pointing momentum. It will lose that momentum as the valve is closed or
the water tank becomes empty, when there is no longer water flowing away from A but a
flow is still impinging on B.

Figure 6.7: A water tank is connected to a shower head, so that water flows out. Water in the pipe
that connects the points marked A and B has a right-pointing momentum, but as long as that pipe
is completely filled with water there is no net horizontal force on the tank.

A. K. Schultz ([136]) argues that, at each instant, the water flowing into the reverse
sprinkler's intake carries a constant angular momentum around the sprinkler pivot, and if 
the sprinkler could turn without any resistance (either from the friction of the pivot or the 
viscosity of the fluid) this angular momentum would be counterbalanced by the angular 
momentum that the sprinkler picked up as the water flow was being switched on. As the 
fluid flow is switched off, such an ideal sprinkler would then lose its angular momentum and 
come to a halt. At every instant, the angular momentum of the sprinkler plus the incoming 
water would be zero. 

Schultz's discussion is correct: In the absence of any resistance, the sprinkler arm itself
moves so as to cancel the momentum of the incoming water, in the same way that the ship
in Fig. 6.6 moves to cancel the momentum of the flying bullets. Resistance, on the other
hand, would imply that some of that momentum is picked up not just by the sprinkler, but
by the tank as a whole. If we cement the pivot to prevent the sprinkler from turning at all,
then the tank will pick up all of the momentum that cancels that of the incoming water.

How does non-ideal fluid behavior affect this analysis? Viscosity, turbulence, and other 
such phenomena all dissipate mechanical energy. Therefore, a non-ideal fluid rushing into 
the reverse sprinkler would acquire less momentum with respect to the pivot, for a given 
pressure difference, than predicted by the analysis we carried out in Section 6.2. Thus the
pressure-difference effect would outweigh the momentum-transfer effect even in the steady 
state, leading to a small torque on the sprinkler even after the fluid has begun to hit the 
inside wall of the sprinkler. Total angular momentum is conserved because the "missing" 
momentum of the incoming fluid is being transmitted to the surrounding fluid, and finally 
to the tank. 
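
One way to make this statement quantitative is to introduce a dimensionless factor $\eta$ for the fraction of the ideal momentum flux that actually reaches the inner wall. The factor $\eta$ is a hypothetical parametrization introduced here purely for illustration; it does not appear in the literature discussed in this chapter and we make no attempt to estimate it.

```latex
F_{\text{momentum}} = \eta \, A \, (P_1 - P_2) , \qquad 0 < \eta \le 1 ,
\qquad\Longrightarrow\qquad
F_{\text{net}} = F_{\text{pressure}} - F_{\text{momentum}} = (1 - \eta) \, A \, (P_1 - P_2) \ge 0 .
```

The ideal fluid corresponds to $\eta = 1$ and no steady-state torque. Any dissipation gives $\eta < 1$ and a small residual torque that turns the sprinkler toward the incoming fluid.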

6.4 History of the reverse sprinkler problem 

The literature on the subject of the reverse sprinkler is abundant and confusing. Ernst 
Mach speaks of "reaction wheels" blowing or sucking air where we have spoken of regular 
or reverse sprinklers respectively ([137]):

It might be supposed that sucking on the reaction wheels would produce the 
opposite motion to that resulting from blowing. Yet this does not usually take 
place, and the reason is obvious . . . Generally, no perceptible rotation takes place 
on the sucking in of the air ... If ... an elastic ball, which has one escape-tube, be 
attached to the reaction-wheel, in the manner represented in [Fig. 6.8(a)], and
be alternately squeezed so that the same quantity of air is by turns blown out 
and sucked in, the wheel will continue to revolve rapidly in the same direction 
as it did in the case in which we blew into it. This is partly due to the fact that 
the air sucked into the spokes must participate in the motion of the latter and 
therefore can produce no reactional rotation, but it also results partly from the 
difference of the motion which the air outside the tube assumes in the two cases. 
In blowing, the air flows out in jets, and performs rotations. In sucking, the air 
comes in from all sides, and has no distinct rotation . . .

Figure 6.8: Illustrations from Ernst Mach's Mechanik ([137]): (a) Figure 153 a in the original, (b)
Figure 154 in the original. (Images in the public domain, copied from the English edition of 1893.)

Mach appears to base his treatment on the observation that a "reaction wheel" is not 
seen to turn when sucked on. He then sought a theoretical rationale for this observation 
without arriving at one that satisfied him. Thus the bluster about the explanation being 
"obvious," accompanied by the tentative language about how "generally, no perceptible 
rotation takes place" and by the equivocation about how the lack of turning is "partly due" 
to the air "participating in the motion" of the wheel and partly to the air sucked "coming 
in from all sides." 

Mach goes on to say that 

if we perforate the bottom of a hollow cylinder . . . and place the cylinder on
[a pivot], after the side has been slit and bent in the manner indicated in
[Fig. 6.8(b)], the [cylinder] will turn in the direction of the long arrow when
blown into and in the direction of the short arrow when sucked on. The air,
here, on entering the cylinder can continue its rotation unimpeded, and this
motion is accordingly compensated for by a rotation in the opposite direction
([138]).

This observation is correct and interesting: It shows that if the incoming water did
not give up all its angular momentum upon hitting the inner wall of the reverse sprinkler,
then the device would turn toward the incoming water, as we discussed at the beginning of
Section 6.2.

In his introduction to Mach's Mechanik, mathematician Karl Menger describes it as
"one of the great scientific achievements of the [nineteenth] century" ([139]), but it seems
that the passage we have quoted was not well known to the twentieth-century scientists
who commented publicly on the reverse sprinkler. Feynman ([131]) gave no answer to
the problem and wrote as if he expected and observed rotation. Some have pointed out,
however, that the fact that he cranked up the pressure until the bottle exploded suggests
another explanation: that he expected rotation and didn't see it. This interpretation seems
to be supported by a recent letter published by E. Creutz, who claims to have been the only
other person at the Princeton cyclotron when Feynman carried out his experiment ([129]).
Creutz, however, explicitly disclaims any knowledge of what Feynman's own theoretical
understanding of the problem was.

In [140] and [141], the authors discuss the problem and claim that no rotation is observed, 
but they pursue the matter no further. In [142], it is suggested that students demonstrate as 
an exercise that "the direction of rotation is the same whether the flow is supplied through 
the hub [of a submerged sprinkler] or withdrawn from the hub," a result that is discounted 
by almost all the rest of the literature. 

Shortly after Feynman's memoirs appeared, A. T. Forrester published a paper in which 
he concluded that if water is sucked out of a tank by a vacuum attached to a sprinkler then 
the sprinkler will not rotate ( |14Hj ). But he also made the strange claim that Feynman's 
original experiment at the Princeton cyclotron, in which he had high air pressure in the 
bottle push the water out, would actually cause the sprinkler to rotate in the direction of 
the incoming water ( |14,Sj ). An exchange on the issue of conservation of angular momentum 
between Schultz and Forrester appeared shortly thereafter ( |136| IT^ ). The following year 
L. Hsu, a high school student, published an experimental analysis that found no rotation 
of the reverse sprinkler and questioned (quite sensibly) Forrester's claim that pushing the 
water out of the bottle was not equivalent to sucking it out ([145]). E. R. Lindgren also 
published an experimental result that supported the claim that the reverse sprinkler did 
not turn (^^). 

^In [149], P. Hewitt proposes a physical setup identical to the one shown in Fig. 6.8(b), and observes 
that the device turns in opposite directions depending on whether the fluid pours out of or into it. Hewitt's 
discussion seems to ignore the important difference between such a setup and the reverse sprinkler. The 
issue has recently been investigated in |13U| . 
After Feynman's death, his graduate research advisor, J. A. Wheeler, published some 
reminiscences of Feynman's Princeton days from which it would appear that Feynman 
observed no motion in the sprinkler before the bottle exploded ("a little tremor as the 
pressure was first applied . . . but as the flow continued there was no reaction") ([147]). In 
1992 the journalist James Gleick published a biography of Feynman in which he states 
that both Feynman and Wheeler "were scrupulous about never revealing the answer to the 
original question" and then claims that Feynman's answer all along was that the sprinkler 
would not turn ([148]). The physical justification that Gleick offers for this answer is 
misleading: Gleick echoes one of Mach's comments in [137] that the water entering the 
reverse sprinkler comes in from many directions, unlike the water leaving a regular sprinkler, 
which forms a narrow jet. Although this observation is correct, it is not very relevant to 
the question at hand. 

The most detailed and pertinent work on the subject, both theoretical and experimental, 
was published by Berg, Collier, and Ferrell, who claimed that the reverse sprinkler turns 
toward the incoming water ([150, 151]). Guided by Schultz's arguments about conservation 
of angular momentum ([136]), the authors offered a somewhat complicated statement of the 
correct observation that the sprinkler picks up a bit of angular momentum before reaching 
a steady state of zero torque once the water is flowing steadily into the sprinkler. When 
the water stops flowing, the sprinkler comes to a halt.^ 

The air-sucking reverse sprinkler at the Edgerton Center at MIT shows no movement at 
all ([153]). As in the setups used by Feynman and others, this sprinkler arm is not mounted 
on a true pivot, but rather turns by twisting or bending a flexible tube. Any transient 
torque will therefore cause, at most, a brief shaking of such a device. The University 
of Maryland's Physics Lecture Demonstration Facility offers video evidence of a reverse 
sprinkler, mounted on a true pivot of very low friction, turning slowly toward the incoming 
water ([152]). According to R. E. Berg, in this particular setup 

^There are other references in the literature to the reverse sprinkler. For a rather humorous exchange, see 
[154] and [155]. Already in 1990 the American Journal of Physics had received so many conflicting analyses 
of the problem that the editor proposed "a moratorium on publications on Feynman's sprinkler" ([156]). In 
one of her 1996 columns for Parade Magazine, Marilyn vos Savant, who bills herself as having the highest 
recorded IQ, offered an account of Feynman's experiment that, she claimed, settled that the reverse sprinkler 
does not move ([157]). Vos Savant's column emphasized the confusion of Feynman and others when faced 
with the problem, leading a reader to respond with a letter to his local newspaper in which he questioned 
the credibility of physicists who address matters more complicated than lawn sprinklers, such as the origin 
of the universe ([158]). 
while the water is flowing the nozzle rotates at a constant angular speed. This 
would be consistent with conservation of angular momentum except for one 
thing: while the water is flowing into the nozzle, if you reach and stop the 
nozzle rotation it should remain still after you release it. [But, in practice,] after 
[the nozzle] is released it starts to rotate again ([162]). 

This behavior is consistent with non-zero dissipation of kinetic energy in the fluid flow, 
as we have discussed. Angular momentum is conserved, but only after the motion of the 
tank is taken into account.^ An earlier, unpublished treatment of how dissipation causes 
a steady-state torque on the reverse sprinkler is due to Titcomb, Rueckner, and Sokol 
([163]). Rueckner also reports that the behavior of a sprinkler made to suck argon gas 
whose viscosity is adjusted by changing its temperature seems to corroborate that higher 
viscosity leads to a larger steady-state torque. This experiment, however, would need to be 
carried out more carefully to confirm the effect ([164]). 

6.5 Conclusions 

We have offered an elementary theoretical treatment of the behavior of a reverse sprinkler, 
and concluded that, under idealized conditions, it should experience no torque while fluid 
flows steadily into it, but as the flow commences, it will pick up an angular momentum 
opposite to that of the incoming fluid, which it will give up as the flow ends. However, in 
the presence of viscosity or turbulence, the reverse sprinkler will experience a small torque 
even in steady state, which would cause it to accelerate toward the incoming water. This 
torque is balanced by an opposite torque acting on the surrounding fluid and finally on the 
tank itself. 
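
As a rough illustration of the two regimes just summarized, the following toy model tracks the 
angular momentum of a sprinkler on a frictionless pivot. It is only a sketch under stated 
assumptions: the function name `simulate`, the start-up impulse `delta_L`, the steady dissipative 
torque `tau_d`, and every numerical value are hypothetical choices made for illustration, not 
quantities taken from our analysis or from any experiment. Positive values denote rotation 
toward the incoming water. 

```python
def simulate(delta_L, tau_d, t_on=1.0, t_off=5.0, t_end=8.0, dt=0.001):
    """Toy model (hypothetical parameters): angular momentum of a reverse
    sprinkler on a frictionless pivot, sampled every dt from 0 to t_end.

    delta_L : angular momentum picked up as the flow commences
              (and given back as the flow ends)
    tau_d   : small steady-state torque due to viscosity or turbulence
              (zero in the idealized case)
    """
    steps = int(round(t_end / dt))
    on_step = int(round(t_on / dt))
    off_step = int(round(t_off / dt))

    L = 0.0
    history = []
    for k in range(steps + 1):
        if k == on_step:
            L += delta_L          # transient as the flow commences
        if k == off_step:
            L -= delta_L          # opposite transient as the flow ends
        if on_step <= k < off_step:
            L += tau_d * dt       # steady dissipative torque while fluid flows
        history.append(L)
    return history


ideal = simulate(delta_L=1.0, tau_d=0.0)   # idealized: no steady-state torque
lossy = simulate(delta_L=1.0, tau_d=0.1)   # with dissipation in the flow
print(ideal[3000], ideal[-1])
print(lossy[3000], lossy[-1])
```

With `tau_d = 0` the sprinkler carries the angular momentum `delta_L` only while the fluid 
flows and returns to rest afterward. With `tau_d > 0` it keeps accelerating toward the incoming 
fluid during the flow and, in the absence of pivot friction, is left with an angular momentum 
equal to `tau_d` times the duration of the flow. 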

Throughout our discussion, our foremost concern was to emphasize physical intuition 
and to make our treatment as simple as it could be made (but not simpler). A question about 
what L. A. Delsasso called, according to Feynman's recollection, "a freshman experiment" 
( |13,Sj ) deserves an answer presented in a language at the corresponding level of complication. 

More important is the principle, famously put forward by Feynman himself when discussing 

^In the late 1950s and early 1960s, there was some interest in the related physics problem of the so-called 
putt-putt (or pop-pop) boat, a fascinating toy boat that propels itself by heating (usually with a candle) an 
inner tank connected to a submerged double exhaust. Steam bubbles cause water to be alternately blown 
out of and sucked into the tank ([159, 160, 161]). The ship moves forward, much as Mach described the 
"reaction wheel" turning vigorously in one direction as air was alternately blown out and sucked in. 
the spin-statistics theorem, that if we can't "reduce it to the freshman level," we don't 
really understand it ([165]). 

We also have commented on the perplexing history of the reverse sprinkler problem, a 
history that is interesting not only because physicists of the stature of Mach, Wheeler, and 
Feynman enter into it, but also because it offers a startling illustration of the fallibility of 
great scientists faced with a question about "a freshman experiment." 